EXPRESS MAIL NO. EM175691585US



NOVEL EPITHELIAL TISSUE TARGETING AGENT 



CROSS-REFERENCE TO RELATED APPLICATION 
This application is a continuation-in-part of United States Patent
Application No. 08/782,481, filed January 10, 1997.

TECHNICAL FIELD 

The present invention relates generally to the targeting of therapeutic 
compounds to specific cells and tissues. The invention is more particularly related to 
targeting molecules for use in delivering compounds to epithelial tissue. Such targeting
molecules may be used in a variety of therapeutic procedures. 

BACKGROUND OF THE INVENTION 

Improving the delivery of drugs and other agents to target tissues has
been the focus of considerable research for many years. Most agents currently
administered to a patient parenterally are not targeted, resulting in systemic delivery of
the agent to cells and tissues of the body where it is unnecessary, and often undesirable.
This may result in adverse drug side effects, and often limits the dose of a drug (e.g.,
cytotoxic agents and other anti-cancer or anti-viral drugs) that can be administered. By
comparison, although oral administration of drugs is generally recognized as a
convenient and economical method of administration, oral administration can result in
either (a) uptake of the drug through the epithelial barrier, resulting in undesirable
systemic distribution, or (b) temporary residence of the drug within the gastrointestinal
tract. Accordingly, a major goal has been to develop methods for specifically targeting
agents to cells and tissues that may benefit from the treatment, and to avoid the general
physiological effects of inappropriate delivery of such agents to other cells and tissues.

In addressing this issue, some investigators have attempted to use
chimeric molecules that bind to growth factor receptors on gastrointestinal epithelial
cells to facilitate transepithelial transport of therapeutic agents (see WO 93/20834).
However, these methods have several disadvantages. For example, such chimeric
molecules are transcytosed through the epithelium from the gut lumen and absorbed
into the blood stream, resulting in systemic distribution and removal from the
epithelium proper. Since the therapeutic agents are targeted specifically away from the
epithelium for systemic distribution, these chimeric molecules are generally not useful
for treatment of epithelium-associated conditions. In addition, TGF-α or other
molecules binding to EGF receptors exhibit many or all of the apparent biological
activities of EGF, such as stimulation of enterocyte mitogenesis or suppression of
gastric secretion. Such effects collateral to the transcytotic uptake of therapeutic agents
may not be desirable or may be contraindicated for intervention in epithelium-associated
conditions or diseases. Furthermore, EGF receptors are not unique to
epithelial cells of the gastrointestinal tract, and can be found on numerous other cells
including kidney cells and hepatocytes. Thus, molecules which have affinity for the
EGF receptor and are distributed systemically in the blood can be rapidly removed from
circulation, accumulated in specific organs and potentially degraded or secreted.

Within an alternative approach, other investigators have employed Fab
fragments of an anti-polymeric immunoglobulin receptor IgG to target DNA to
epithelial cells in vitro that contain such a receptor (see Ferkol et al., J. Clin. Invest.
92:2394-2400, 1993). Still other researchers have described the translocation of a
chimeric IgA construct across a monolayer of epithelial cells in vitro (see Terskikh et
al., Mol. Immunol. 31:1313-1319, 1994). Others have used ascites tumor implants in
vivo in mice and observed an IgA dimeric antibody produced by subcutaneous tumor
cells to accumulate in feces, suggesting that IgA is transported across an epithelial
barrier of the gastrointestinal tract (see Greenberg et al., Science 272:104-107, 1996).

Notwithstanding the above-noted developments, there remains a need in
the art for systems for delivering agents to target cells, particularly epithelial cells and
cells or tissues bounded by epithelial cells. The present invention fulfills these needs
and further provides other related advantages.
SUMMARY OF THE INVENTION

Briefly stated, the present invention provides targeting molecules for the
specific delivery of biological agents to epithelial cells and tissues. In several aspects,
the present invention provides a targeting molecule linked to at least one biological
agent. In one such aspect, the targeting molecule comprises a polypeptide that (a) forms
a closed covalent loop; and (b) contains at least three peptide domains having β-sheet
character, each of the domains being separated by domains lacking β-sheet character;
wherein the polypeptide is not a full length dimeric IgA. In specific embodiments, the
polypeptide further contains one or more of the following additional domains: a fourth
peptide domain having β-sheet character, separated from other domains having β-sheet
character by a domain lacking β-sheet character; a linear N-terminal domain; and a
C-terminal domain, which may comprise a linear peptide having β-sheet character and/or a
covalently closed loop.

Within other such aspects, the targeting molecule comprises a sequence
recited in any one of SEQ ID NO:1 - SEQ ID NO:8 and SEQ ID NO:13.

In a further related aspect, the present invention provides a targeting 
molecule capable of specifically binding to a basolateral factor associated with an 
epithelial surface and causing the internalization of a biological agent linked thereto, 
wherein the targeting molecule is not full length dimeric IgA. 

Within related aspects, the targeting molecule comprises a polypeptide
that: (a) forms a closed covalent loop; and (b) contains at least three peptide domains
having β-sheet character, each of the domains being separated by domains lacking
β-sheet character; wherein the targeting molecule is linked to at least one biological agent
by a substrate for an intracellular or extracellular enzyme associated with or secreted
from an epithelial barrier, or by a side chain of an amino acid in an antibody combining
site.

Within further related aspects, the targeting molecule is linked to at least
one biological agent, wherein the targeting molecule comprises a polypeptide that: (a)
forms a closed covalent loop; and (b) contains at least three peptide domains having
β-sheet character, each of the domains being separated by domains lacking β-sheet
character; wherein the biological agent is not naturally associated with the targeting
molecule, and wherein the biological agent is not iodine.

Within another aspect, the present invention provides a pharmaceutical 
composition comprising a targeting molecule linked to at least one biological agent, as 
described above, in combination with a pharmaceutically acceptable carrier.

In further aspects, methods are provided for treating a patient afflicted 
with a disease associated with an epithelial surface, comprising administering to a 
patient a pharmaceutical composition as described above. Such diseases include cancer, 
viral infection, inflammatory disorders, autoimmune disorders, asthma, celiac disease, 
colitis, pneumonia, cystic fibrosis, bacterial infection, mycobacterial infection and
fungal infection. 

Within related aspects, the present invention provides methods for 
inhibiting the development in a patient of a disease associated with an epithelial surface, 
comprising administering to a patient a pharmaceutical composition as described above.

These and other aspects of the present invention will become apparent
upon reference to the following detailed description and attached drawings. All
references disclosed herein are hereby incorporated by reference in their entirety as if
each was incorporated individually.

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 is a comparison of native J chain sequences reported for human
(top line) (SEQ ID NO:1), mouse (second line) (SEQ ID NO:2), rabbit (third line) (SEQ
ID NO:3), cow (fourth line) (SEQ ID NO:4), bullfrog (fifth line) (SEQ ID NO:5) and
earthworm (sixth line) (SEQ ID NO:6). For each non-human sequence, amino acid
residues that are identical to those in the human sequence are indicated by a dash.
Residues that differ from the human sequence are indicated using standard one letter
abbreviations.
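
As an illustration of the dash notation described for Figure 1, the following Python sketch expands a dash-encoded row against a reference row and reports the fraction of positions marked identical. The function names and the two short rows are hypothetical and are not taken from Figure 1. The sketch also assumes that both rows have equal length and that a dash never denotes an alignment gap.

```python
def expand_dash_row(reference: str, encoded: str) -> str:
    """Return the full sequence for a row that marks identity with '-'."""
    if len(reference) != len(encoded):
        raise ValueError("aligned rows must have the same length")
    # Copy the reference residue wherever the encoded row shows a dash.
    return "".join(ref if enc == "-" else enc
                   for ref, enc in zip(reference, encoded))


def percent_identity(encoded: str) -> float:
    """Percentage of positions marked as identical to the reference."""
    if not encoded:
        raise ValueError("row is empty")
    return 100.0 * encoded.count("-") / len(encoded)


# Hypothetical toy rows, NOT the sequences shown in Figure 1.
REFERENCE_ROW = "ACDEFGHIKL"
ENCODED_ROW = "-S--Y---R-"

if __name__ == "__main__":
    print(expand_dash_row(REFERENCE_ROW, ENCODED_ROW))
    print(percent_identity(ENCODED_ROW))
```

With real data, each non-human row would be passed as `encoded` together with the human row as `reference`.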
DETAILED DESCRIPTION OF THE INVENTION

As noted above, the present invention is generally directed to targeting
molecules (TMs) for use in the delivery of drugs and other biological agents to
epithelial cells. Upon delivery to an epithelial cell, the agent may remain within the cell
or may undergo transepithelial transport via transcytosis. For example, the agent and
TM may be transported across the basolateral surface and remain within the epithelial
cell, or the agent may remain within the cell while the TM undergoes transepithelial
transport. Agents that remain within the epithelial cell may modify an activity or
function of a cellular component or a foreign component, such as a virus. Alternatively,
both the agent and TM may undergo transcytosis. For example, an agent linked to a
TM may pass through an epithelial cell surface to access an adjacent cell, tissue or
compartment (e.g., lumen of the small intestine, bronchial airway, vaginal cavity),
and/or may bind a substance within an epithelial cell and then remove the substance
from the cell. Further, an agent may (but need not) be designed to be inactive when
entering the epithelial cell, and be activated following transcytosis or upon a specific
event (e.g., viral infection).

Prior to setting forth the present invention in detail, definitions of certain 
terms used herein are provided. 

Epithelial surface (or epithelial barrier): A surface lining the exterior of
the body, an internal closed cavity of the body or body tubes that communicate with the
exterior environment. Epithelial surfaces include the genitourinary, respiratory,
alimentary, ocular conjunctiva, nasal, oral and pharyngeal cavities, as well as the ducts
and secretory portions of glands and receptors of sensory organs. The term "epithelial
surface" as used herein is synonymous with "epithelial barrier." One side of an
epithelial surface is free of adherence to cellular and extracellular components, other
than coating substances and secretions. The other side of the surface is normally
adjacent to the basement membrane and is exposed to interstitial fluids and components
of the underlying tissues. Epithelial surfaces are typically formed from cells in close
apposition to one another, the contact between plasma membranes of adjacent cells
characterized by a tight junction (zonula occludens) which delimits the outside and
inside domains of an epithelial surface. An experimental epithelial-like surface can be
generated in vitro with autonomously replicating cell lines (e.g., MDCK, ATCC No.
CCL34; HEC-1A, ATCC No. HTB 112), which form epithelial-like surfaces in culture,
have tight junctions and articulate one free (apical) and one adherent (basolateral)
domain.

Apical domain: The outside of an epithelial surface which is adjacent to
the environment external to the body or to the volume of a body cavity or body tube.
The outside of the cells, as delimited by the zonula occludens, is composed of the
coating substances, secretions and cell membranes facing the outside of the epithelial
surface.

Luminal compartment: The inner space of a body tube, cavity or duct
lined by an epithelial surface and adjacent to the apical domain.

Basolateral domain: The inside of the epithelial surface which is
delimited from the apical domain by the zonula occludens. The basolateral domain is
adjacent to the basement membrane and is exposed to interstitial fluids and components
of the tissues underlying epithelial surfaces. The basolateral domain is the inner side of
cells of an epidermal surface.

Basolateral membrane: The portion of the plasma membrane of a cell of
an epithelial surface which is within the basolateral domain.

Basolateral factor: A component of the basolateral domain which is a
naturally occurring element of a basolateral membrane in vivo. A "basolateral factor
associated with an epithelial surface" refers to a basolateral factor attached by covalent
or noncovalent bonds to a basolateral domain, or a component of the membrane proper
in a basolateral domain.

Internalization: The process of uptake into a cell compartment that is
bounded by a plasma membrane.

Specific binding: A TM specifically binds to a basolateral domain if it
specifically interacts at the basolateral domain of an epithelial surface. Both
quantitative and qualitative assays may be used to distinguish specific binding from
binding which is not specific within the context of the subject invention. A quantitative
measurement of binding affinity (K_aff) may be used to identify components that bind
specifically. In general, a K_aff of 10⁴ M⁻¹ or higher constitutes specific binding between
two binding components. The binding affinity for the cognate components of a binding
interaction can be estimated experimentally by a variety of methods that are well known
in the art, including equilibrium dialysis assays, precipitation radioimmunoassays,
assays with immobilized ligands, assays with isolated cells or membranes, ELISAs, or
by other direct or indirect measurements of binding (e.g., plasmon resonance).

Qualitative specificity of binding is demonstrated by differential, or
asymmetric, distribution of binding of a factor among two or more chemical, spatial or
temporal domains. This differential distribution can be observed visually, or by
chemical or physical means, and generally reflects a differential of at least approximately
3 to 1 in signal intensity between basolateral and non-basolateral domains. Such
qualitative specificity may result from substantial differences in the affinity of binding
of an agent to one of several domains, or to the number or availability of cognate
binding sites on a domain. The qualitative specificity of binding of an agent among
several domains can be observed in a competition experiment. In such an experiment a
TM is allowed to distribute among domains, and at equilibrium is observed to
preferentially bind to one domain over another.
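
The 3 to 1 criterion can be expressed as a simple ratio test. The following Python sketch is illustrative only: the function names are hypothetical, the intensities are assumed to be background-corrected values in the same arbitrary units, and the handling of a zero non-basolateral signal is a choice made for this sketch rather than something specified in the text.

```python
def signal_ratio(basolateral: float, non_basolateral: float) -> float:
    """Ratio of basolateral to non-basolateral signal intensity."""
    if basolateral < 0 or non_basolateral < 0:
        raise ValueError("signal intensities must be non-negative")
    if non_basolateral == 0:
        # Assumption of this sketch: any signal over a zero background
        # counts as an unbounded differential; no signal at all counts as none.
        return float("inf") if basolateral > 0 else 0.0
    return basolateral / non_basolateral


def is_qualitatively_specific(basolateral: float,
                              non_basolateral: float,
                              threshold: float = 3.0) -> bool:
    """True when the differential meets the 3 to 1 criterion."""
    return signal_ratio(basolateral, non_basolateral) >= threshold
```

For example, intensities of 900 and 250 give a ratio of 3.6 and pass the test, whereas 500 and 250 give 2.0 and do not.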

Targeting Molecule (TM): A molecule capable of specifically binding to
a cognate factor on epithelial surfaces, which is not uniformly distributed.

Biological agent: Any molecule, group of molecules, virus, component
of a virus, cell or cell component that is synthesized by a cell or ex vivo, can be derived
from a cell and/or can be demonstrated to modify the properties of a cell. Biological
agents include therapeutic agents (i.e., drugs and other medicinal compounds useful for
treating or preventing a disorder or regulating the physiology of a patient).

Linked: A biological agent is linked to a TM if it is attached covalently,
by ionic interaction and/or by hydrophobic interactions, or by other means such that 
under physiological conditions of pH, ionic strength and osmotic potential the linked 
entities are associated with each other at equilibrium. 
TMs as described herein are generally capable of specifically binding to
a factor preferentially distributed on an epithelial surface, such as a basolateral factor.
Through binding to such a factor, TMs are capable of causing the internalization of a
biological agent linked to the TM. TMs as described herein have a distinct
three-dimensional structure. In general, TMs comprise a polypeptide that forms a closed
covalent loop which is referred to herein as the "core." All subunits of the polypeptide
may, but need not, be connected by identical chemical bonds. In a preferred
embodiment, the polypeptide comprises amino and/or imino acids covalently joined by
peptide bonds and one or more cystine disulfide bridges.

The core of a TM typically contains at least three peptide domains
having β-sheet character, interspersed among regions lacking β-sheet character. In this
regard, a "peptide domain" is a portion of a polypeptide comprising at least three amino
acid residues. A peptide domain is said to have β-sheet character if the peptide
backbone has an extended conformation with side-chain groups in a near planar and
alternating arrangement such that hydrogen bonding can occur between carbonyl and
NH groups of the backbone of adjacent β-strands. Furthermore, TMs generally contain
at least one cysteine residue not present within an intramolecular cystine. Such
cysteine(s) may be used for linking one or more biological agents to the TM, although
other means of linking biological agents are also contemplated.

One or more of a variety of other structures may, but need not, be
additionally present within a TM. For example, a second peptide loop may be present
within the core sequence. Additional N-terminal and/or C-terminal sequences may be
present. If present, N-terminal sequences are usually linear. A preferred N-terminal
sequence is a short (about 1-20 amino acid residues) peptide domain. C-terminal
sequences may be linear and/or may form one or more loops. Such sequences may, but
need not, possess domains having β-sheet character. These and/or other protein
domains may be added to the core by genetic means or chemically, using covalent
bonds or noncovalent interactions.

In a preferred embodiment, a TM comprises all or a portion of a native J
chain sequence, or a variant thereof. J chain is a 15 kD protein that, in vivo, links IgM
or IgA monomers to form pentameric IgM or dimeric IgA (see Max and Korsmeyer, J.
Exp. Med. 161:832-849, 1985). To date, sequences of J chains from six organisms have
been deduced (see Figure 1 and SEQ ID NO:1 - SEQ ID NO:6; Kulseth and Rogne,
DNA and Cell Biol. 13:37-42, 1994; Matsuuchi et al., Proc. Natl. Acad. Sci. USA
83:456-460, 1986; Max and Korsmeyer, J. Exp. Med. 161:832-849, 1985; Hughes et al.,
Biochem. J. 271:641-647, 1990; Mikoryak et al., J. Immunol. 140:4279-4285, 1988;
Takahashi et al., Proc. Natl. Acad. Sci. USA 93:1886-1891, 1996). A TM may
comprise a native J chain from one of these organisms, or from any other organism.

Alternatively, a TM may comprise a portion or variant of native J chain
sequence. A variant is a polypeptide that differs from a native sequence only in one or
more substitutions and/or modifications. Portions and variants of the native J chain
sequence contemplated by the present invention are those that substantially retain the
ability of the native J chain to specifically bind to a basolateral factor associated with an
epithelial surface, and cause the internalization of a linked biological agent. Such
portions and variants may be identified using, for example, the representative assays
described herein.

Within the context of the TM compositions provided herein, the TM is
not full length dimeric IgA. More specifically, the TM does not contain all of the
components present within a naturally-occurring IgA (i.e., a heavy chain containing
contiguous variable, C_H1α, C_H2α and C_H3α domains and a light chain containing
contiguous variable and C_L domains). Such a TM may, of course, contain one or more
portions of an IgA molecule, including an IgM.

As noted above, specific binding may be evaluated using quantitative
and/or qualitative methods. In one representative quantitative assay, secretory
component (SC) isolated from human milk by standard immunoaffinity
chromatography methods (Underdown et al., Immunochemistry 14:111-120, 1977) is
immobilized on a CM5 sensor chip with a BIACORE apparatus (Pharmacia,
Piscataway, New Jersey) by primary amine coupling. The sensor chip is activated by
injection of 30 µL of 0.05 M N-hydroxysuccinimide and
N-ethyl-N-(3-diethylaminopropyl)carbodiimide, followed by injection of 25 µL of human SC
(15 µg/mL) in 10 mM sodium acetate, pH 5.0. Unreacted carbodiimide is then quenched
with 30 µL ethanolamine. All reagents are delivered at a flow rate of 5 µL per minute.
To evaluate the kinetics of binding and desorption, serial two-fold dilutions of TMs at
concentrations between 100 µM and 100 nM are injected in binding buffer (25 mM
Tris, pH 7.2, 100 mM NaCl, 10 mM MgCl₂) at a flow rate of 20 µL per minute.
Between dilutions, the surface is regenerated by injecting 50 µL of 25 mM Tris, pH 7.2,
200 mM NaCl, 2 M urea, followed by injecting 50 µL of binding buffer. Association
(k_a) and dissociation (k_d) constants are derived from sensorgrams using
BIAevaluation 2.1 software. The K_aff is estimated as k_a/k_d.
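
The arithmetic in this assay can be sketched in a few lines of Python. The sketch below generates a serial two-fold dilution series between the stated concentration limits and estimates K_aff as k_a/k_d, comparing it with the 10⁴ M⁻¹ level given in the definition of specific binding. The function names are hypothetical, and the units (k_a in M⁻¹ s⁻¹, k_d in s⁻¹) are an assumption of the sketch; the kinetic values in the usage note are made-up numbers, not measured data.

```python
from typing import List


def twofold_dilutions(start_molar: float, floor_molar: float) -> List[float]:
    """Concentrations from start_molar downward, halving until below floor_molar."""
    if start_molar <= 0 or floor_molar <= 0:
        raise ValueError("concentrations must be positive")
    series = []
    conc = start_molar
    while conc >= floor_molar:
        series.append(conc)
        conc /= 2.0
    return series


def affinity_constant(k_a: float, k_d: float) -> float:
    """Estimate K_aff (M^-1) as k_a / k_d; assumes k_a in M^-1 s^-1, k_d in s^-1."""
    if k_d <= 0:
        raise ValueError("k_d must be positive")
    return k_a / k_d


def meets_specific_binding(k_aff: float, threshold: float = 1e4) -> bool:
    """True when K_aff is at or above the 10^4 M^-1 level."""
    return k_aff >= threshold


if __name__ == "__main__":
    # 100 uM down to no lower than 100 nM.
    for conc in twofold_dilutions(100e-6, 100e-9):
        print(f"{conc:.3e} M")
```

As a usage example with hypothetical values, `affinity_constant(2e4, 1e-3)` gives a K_aff of about 2 × 10⁷ M⁻¹, which `meets_specific_binding` reports as specific.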

In one representative qualitative assay, monolayers of HEC-1A cells can
be used to measure qualitative binding of TMs. The procedure is based on previously
published protocols (see Ball et al., In Vitro Cell Biol. 31:96, 1995). HEC-1A cells are
cultured on 24 mm filter transwells (Costar, #3412, 0.4 µm) for one week until cells are
confluent. Monolayer-covered filter transwells are washed twice on both surfaces with
cold PBS (4°C). One mL of cold MEM-BSA containing 1.0 µg of biotinylated ligand is
added to the apical chamber and 1.5 mL cold MEM-BSA buffer (MEM-BSA (4°C):
minimum essential medium with Hank's salts, and 25 mM HEPES buffer without
L-glutamine (Life Technologies, Gaithersburg, Maryland; Cat. No. 12370) containing
0.5% BSA, which is treated at 56°C for 30 min to inactivate endogenous protease and
filter sterilized) containing 1.5 µg of biotinylated ligand is added to the basolateral
chamber. The cultures are kept at 4°C for 2 hours to achieve maximum binding in the
absence of internalization. The medium is removed from both chambers, and the filters
are washed twice with cold PBS. Filters are then removed from the transwell supports
with a scalpel and incubated with a streptavidin-fluorescein conjugate (#21223, Pierce
Chemical Company, Rockford, Illinois), 0.1 µg/mL in cold PBS, then washed 3 times
with cold PBS. Pieces of filter 1 cm square are then cut from the 24 mm filter,
mounted on microscope slides and observed microscopically under epifluorescence
illumination (excitation 490 nm, emission 520 nm). Under these conditions the apical
membranes show little or no fluorescence, while basolateral membranes demonstrate
bright fluorescence (i.e., greater than a 3 to 1 differential in signal intensity) indicating
specific binding to the basolateral domain. Similar assays can be employed with
isolated epithelial tissues from gastrointestinal, oral or bronchial epithelial tissue layers.

Once bound to the basolateral domain of an epithelial cell, a TM may be
internalized within a cell of an epithelium-like monolayer. Suitable cells for evaluating
internalization include MDCK cells expressing the human polyimmunoglobulin
receptor (pIgR) (see Tamer et al., J. Immunol. 155:101-114, 1995) and HEC-1A cells.
One assay in which internalization can be observed employs a HEC-1A cell line grown
to confluent monolayers on permeable membrane supports (such as Costar, Cambridge,
Massachusetts, #3412). Briefly, 100 ng to 10 µg of a TM (e.g., fluorescein labeled)
may be added to 1.5 mL of assay buffer in the basolateral compartment of cell
monolayers and incubated at a temperature that allows binding and internalization of
TMs, but that inhibits transcytosis (e.g., 90 minutes at 16°C). The medium from both
compartments is then removed and the filter membranes washed (e.g., twice at 4°C with
PBS). The membrane is immersed in a fixation solution of, for example, 3% (w/v)
paraformaldehyde, 1% (w/v) glutaraldehyde, 5% (w/v) sucrose, 100 mM Na phosphate
pH 7.4 on ice for 30 minutes. The membranes may be removed from the plastic insert
by cutting around the periphery with a scalpel and cut into 5 mm square sections. These
wholemount sections may be placed on microscope slides and observed microscopically
under epifluorescence illumination (excitation 490 nm, emission 520 nm) or by
fluorescence confocal microscopy. Internalized TM is indicated by the presence of
bright green-yellow fluorescence in intracellular vesicles.
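
To give a sense of scale for the amounts used in this assay, the following Python sketch converts a mass of TM in a given volume to a molar concentration. The function name is hypothetical, and the default molecular weight of 15,000 g/mol is an assumption based on the 15 kD figure given above for native J chain; a TM carrying a linked agent or additional domains would have a different molecular weight.

```python
def molar_concentration(mass_g: float,
                        volume_l: float,
                        mol_weight_g_per_mol: float = 15_000.0) -> float:
    """Molar concentration (mol/L) of mass_g dissolved in volume_l."""
    if volume_l <= 0 or mol_weight_g_per_mol <= 0:
        raise ValueError("volume and molecular weight must be positive")
    if mass_g < 0:
        raise ValueError("mass must be non-negative")
    moles = mass_g / mol_weight_g_per_mol
    return moles / volume_l


if __name__ == "__main__":
    # 100 ng and 10 ug of TM in 1.5 mL of assay buffer.
    for mass in (100e-9, 10e-6):
        print(f"{molar_concentration(mass, 1.5e-3):.3e} M")
```

Under the stated assumption, the range of 100 ng to 10 µg in 1.5 mL corresponds to roughly 4 nM to 440 nM.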

Substitutions and modifications that result in a variant that retains the
qualitative binding specificity for a basolateral factor (i.e., a 3 to 1 or greater differential
in signal intensity between basolateral and non-basolateral domains) are considered to
be conservative. Preferred conservative substitutions and modifications include
alterations in a sequence that render it, at least in part, consistent with the J chains of
one or more other species. A TM may also, or alternatively, contain other sequences
that confer properties not present in a native J chain. Other preferred modifications
include the addition of one or more protein domains at the N- and/or C-terminus and/or
altering the order of domains present within a native J chain sequence. A variant may
contain any combination of such substitution(s) and/or modification(s), provided that
the ability of the variant to specifically bind to an epithelial basolateral factor and cause
internalization of the linked biological agent is not substantially reduced.

A native J chain typically has 6 domains. The first (N-terminal) domain
is a short linear (i.e., as contrasted to a loop) peptide that serves (in vivo) as the junction
between the signal peptide and the core TM molecule. Domain 1 typically contains 1-20
amino acid residues, and the first amino acid is generally D, E or Q. In Figure 1,
Domain 1 contains the amino acids up to and including residue number 11. Domain 1
is not essential for TM function, and variants that do not contain this domain are within
the scope of the present invention.

Domain 2 typically contains 90 amino acids, and possesses substantial
β-sheet character. This β-sheet region contains peptides of varying length lacking
β-strand character (e.g., residues 26-31, 49-53), the peptides usually containing polar
and/or charged amino acids. In a TM, Domain 2 is a covalently closed peptide loop,
called the core, which is typically formed by an intramolecular cystine composed of the
initial and ultimate residues of Domain 2 (residues 12 and 101 of Figure 1). Within
Domain 2, there may be another cystine bond that defines Domain 3, a peptide loop that
is nested within the core. It has been found, within the context of the present invention,
that the core (with or without Domain 3) is sufficient to provide TM function.
Accordingly, a preferred TM contains Domain 2 (i.e., residues 12-70 and 92-101 of
Figure 1), or a portion or variant thereof that substantially retains TM function.

Within Domain 2, the second cysteine is generally separated from the
initial cysteine of Domain 2 by a single amino acid residue (see, for instance, Figure 1).
Between the second and third cysteines of Domain 2 is a region of primarily β-sheet
character. These two cysteines (2 and 3), when present, typically do not form cystines
within the core. The fourth cysteine is typically separated from the third cysteine by
two basic amino acid residues and initiates Domain 3. Domain 3 ends with the fifth
cysteine which is oxidized by the fourth cysteine. The resulting cystine forms a
covalent peptide loop defining Domain 3 contained completely within Domain 2.
Cysteine 6 is the ultimate residue of Domain 2, and is oxidized to cystine by the initial
residue of Domain 2.

Within the core is a canonical peptide sequence for N-linked
glycosylation (e.g., NIS). When produced by eukaryotic cells, carbohydrate moieties
can be covalently attached to an N residue of a TM at this site.

When present, Domain 3 is typically a peptide 21 amino acids in length. 
This domain is delimited by amino and carboxy terminal cysteine residues which form 
an intramolecular cystine bond that is contained completely within the core. 

Domains 4-6 are carboxy terminal domains in native J chains which
may, but need not, be present within a TM. Domain 4 is typically a peptide of seven
amino acids. In native J chains, this peptide contains no cysteine residues and connects
the core to Domain 5. Domain 5 is, when present, typically a peptide of 26 amino acids
delimited by amino and carboxy terminal cysteine residues which form an
intramolecular cystine bond resulting in a covalently closed loop. In native J chains, the
amino and carboxy terminal portions of Domain 5 have substantial β-sheet character
and are separated by a short 3-6 residue peptide with low β-sheet propensity. Domain 6
is typically a short peptide of five amino acids or fewer which serves as the carboxy
terminus of a TM. Domains 4-6 are not essential for TM function.
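
The domain layout described above can be derived mechanically from the positions of the cysteine residues. The following Python sketch is one way to do so, under the assumption that the sequence contains exactly eight cysteines: six within Domain 2 as numbered above, plus the two that delimit Domain 5. The function name and the toy sequence are hypothetical; the toy sequence is constructed only to mirror the described spacing and is not a J chain sequence.

```python
from typing import Dict, Tuple


def j_chain_like_domains(sequence: str) -> Dict[str, Tuple[int, int]]:
    """Map domain names to 1-based inclusive (start, end) residue ranges.

    Assumes exactly eight cysteines. A range whose end is smaller than
    its start denotes an absent (empty) domain.
    """
    cys = [i + 1 for i, aa in enumerate(sequence.upper()) if aa == "C"]
    if len(cys) != 8:
        raise ValueError(f"expected 8 cysteines, found {len(cys)}")
    c1, _c2, _c3, c4, c5, c6, c7, c8 = cys
    return {
        "Domain 1": (1, c1 - 1),            # linear N-terminal peptide
        "Domain 2": (c1, c6),               # core loop, Cys1 to Cys6
        "Domain 3": (c4, c5),               # loop nested in the core
        "Domain 4": (c6 + 1, c7 - 1),       # cysteine-free linker
        "Domain 5": (c7, c8),               # C-terminal loop
        "Domain 6": (c8 + 1, len(sequence)),
    }


# Hypothetical toy sequence, NOT a native J chain.
TOY_SEQUENCE = "DNKCACVVVVCKRCGGGCTTCSSSSSSSCPPPPCDE"

if __name__ == "__main__":
    for name, bounds in j_chain_like_domains(TOY_SEQUENCE).items():
        print(name, bounds)
```

Variants that lack cysteines 2 and 3, or that are truncated after Domain 4, would not satisfy the eight-cysteine assumption and would need a different rule.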

As noted above, numerous variants of native J chain sequences may be
employed within TMs as described herein. For example, a TM core, as described
above, can serve as a molecular scaffolding for the attachment and/or substitution of
Domains and/or additional molecular components. Possible variants include:

• TMs in which Domain 1 comprises a peptide of about 13 amino acids, 
the middle third of which has substantial P-sheet character {e.g., DQEDERIVLVDNK; 

25 SEQIDNO:37); 

• TMs in which the asparagine residue at position 48 is changed to 
histidine {e.g., AAT to CAC); 

• TMs in which Domain 1 comprises a three amino acid peptide DNK; 

• TMs in which Domain 1 contains a peptide with a sequence specific 
30 for recognition and cleavage by a protease which can be used to release distal portion of 
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the TM from a proximal colinear peptide or protein (e.g., a peptide recognized by the 
tobacco etch virus protease Nia: ENLYFQS; SEQ ID NO:38); 

• TMs in which Domain 1 contains a peptide sequence which specifies 
the intracellular targeting of the contiguous peptide (e.g., a nuclear targeting peptide); 

5 • TMs in which one or both of the native cysteine residues 2 or 3 within 

Domain 2 are removed or replaced to eliminate the possibility of intermolecular 
crosslinking (e.g., substitutions of S, T, A, V or M residues for the native C); 

• TMs in which a portion of Domain 3 is deleted, such that there is a 
peptide bond between the amino acid distal to the end of the third (3-sheet of Domain 3 

10 and the initial residue of the ultimate peptide of Domain 3; 

• TMs in which other peptides that form loop structures or other 
antiparallel peptide domains are included in place of Domain 3, or between its defining 
cysteines, to provide functionalities or recognition domains to the TM (e.g., viral capsid 
protein loops); 

• TMs in which Domain 4 is truncated to form a TM without Domains 5
and 6;

• TMs in which Domain 4 is replaced as described above for Domain 3 
to introduce a new functionality, specificity and/or structure to the TM; 

• TMs in which Domain 4 contains a proteolytic site specific for a
cellular compartment which would result in cleavage of the TM into two molecules in a
cellular compartment;

• TMs in which the loop structure of Domain 5 is replaced with a peptide 
sequence to provide functionalities or recognition domains to the TM (e.g., single chain 
antibody variable region or viral capsid protein loop); 

• TMs in which Domain 6 is terminated in a peptide sequence or is
replaced with a peptide sequence that would target the contiguous TM protein to an
intracellular target (e.g., KDEL, SEQ ID NO:44, or HDEL, SEQ ID NO:102, for
retention in the endomembrane system);

• TMs that additionally comprise one or more immunoglobulin-derived
sequences (e.g., domains of the Ig heavy chain classes: alpha3, alpha2, alpha1, mu4,
mu3, mu2, mu1) linked via one or more disulfide and/or peptide bonds. Such
sequences may serve as attachment sites for one or more biological agents.

The above list of representative variants is provided solely for illustrative
purposes. Those of ordinary skill in the art will recognize that the modifications recited
above may be combined within a single TM and that many other variants may be
employed in the context of the present invention.
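
By way of illustration only, the codon substitution recited in the list above for
position 48 (AAT to CAC) can be checked against the standard genetic code with a short
script. The following is a minimal sketch and forms no part of the described methods;
the helper name `translate_codon` and the compact table layout are assumptions made
for the example.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_codon(codon):
    """Return the one-letter amino acid for a DNA codon ('*' marks a stop codon)."""
    return CODON_TABLE[codon.upper()]


if __name__ == "__main__":
    # Codon change named in the text: asparagine (N) to histidine (H).
    print(translate_codon("AAT"), "->", translate_codon("CAC"))
```

Under the standard code, AAT encodes asparagine and CAC encodes histidine, consistent
with the substitution described. A host cell using a modified genetic code would
require a correspondingly modified table.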

TMs may generally be prepared using any of a variety of well known
purification, chemical and/or recombinant methods. Naturally-occurring TMs (e.g.,
human J chain) may be purified from suitable biological materials, as described herein.
All or part of a TM can be synthesized in living cells, with the sequence and content
defined by the universal genetic code, a subset of the genetic code or a modified genetic
code specific for the living cells. Any of a variety of expression vectors known to those
of ordinary skill in the art may be employed to achieve expression in any appropriate
host cell. Suitable host cells include insect cells, yeast cells, mammalian cells, plant
cells, algae, bacteria and other animal cells (e.g., hybridoma, CHO, myeloma).

An example of a synthetic gene encoding a targeting molecule is
provided in SEQ ID NO:7. Such synthetic genes may be ligated into, for example, a
polyhedrin-based baculovirus transfer vector such as pMelBac A, pMelBac B or
pMelBac C (Invitrogen, San Diego, California) between suitable restriction sites (e.g.,
the BamHI and SalI sites) and introduced into insect cells such as High Five, Sf9 or
Sf21 in a cotransfection event using Bac-N-Blu AcMNPV DNA (Invitrogen, San
Diego, California) according to standard methods. Other suitable vectors and host cells
will be readily apparent to those of ordinary skill in the art.

Synthetic polypeptide TMs or portions thereof having fewer than about
100 amino acids, and generally fewer than about 50 amino acids, may be generated
using synthetic techniques well known to those of ordinary skill in the art. For
example, such polypeptides may be synthesized using any of the commercially
available solid-phase techniques, such as the Merrifield solid-phase synthesis method,
where amino acids are sequentially added to a growing amino acid chain. See
Merrifield, J. Am. Chem. Soc. 85:2149-2154, 1963. Equipment for automated synthesis
of polypeptides is readily available from suppliers such as Applied BioSystems, Inc.,
Foster City, California, and may be operated according to the manufacturer's
instructions.

In addition to the TMs described above, there are other molecules which
may bind specifically to a basolateral factor associated with an epithelial cell and
subsequently result in internalization into epithelial cells followed by transcytosis
through the epithelial barrier. Such molecules include peptides or proteins containing
antibody domains which bind to the polyimmunoglobulin receptor. This type of
molecule may be identified in screening assays employing epithelium-like surfaces in
culture.

Within one suitable screening assay, a combinatorial library of peptides
is employed, each peptide of which contains an easily identifiable biochemical or
chemical marker such as a biotinyl-lysine residue, or a tyrosine residue modified by
covalent linkage to radiolabeled iodine. In such an assay, individual peptides or
families of peptides with 8 to 15 amino acid residues are incubated in solutions exposed
to the basolateral surface of an epithelium-like monolayer cell culture. After incubation
of the peptide solution, the solution on the apical surface of the cell culture is assayed
for the presence of transported peptides by analysis for the biochemical or chemical
marker included during synthesis. Subsequent analysis of the peptide sequence of the
transported peptide, for instance by mass spectrometry, is used to reveal the identity of a
peptide which can be transported across an epithelium-like surface in culture. Any
peptide identified in this manner is then synthesized by chemical means to contain a
fluorescent marker. The peptide containing a fluorescent marker is then incubated in
solutions exposed to the basolateral surface of an epithelium-like monolayer cell culture
under conditions which allow binding, but not internalization (e.g., 4°C) or under
conditions which allow uptake but not transcytosis (e.g., 16°C) and the cells are observed
microscopically to determine the ability of the peptides to bind or to be internalized by the
cells of an epithelium-like layer.

A similar assay can be used to screen populations of monoclonal
antibodies, single chain antibodies, antibody combining regions, or Fab fragments for
the ability to bind to, be internalized by and be transcytosed by epithelial cells containing the
polyimmunoglobulin receptor. Antibodies raised in animals immunized with secretory
component or with the polyimmunoglobulin receptor, or in animals naive to such
immunization, are incubated in solutions exposed to the basolateral surface of an
epithelium-like monolayer cell culture. After incubation of antibodies, the solution on
the apical surface of the cell culture is assayed for the presence of transported antibodies
by analysis for the presence of antibody or antibody fragment. This evaluation can be
performed using commercially available antibodies for enzyme linked immunosorbent
assays, or by immunoblotting techniques. Either of these assays can be performed
easily by one skilled in the art of characterizing antibodies.

Any antibody or antibody fragment identified in this manner may then be
isolated and conjugated to a fluorescent marker. The immunoglobulin thus attached to a
fluorescent marker is then incubated in solutions exposed to the basolateral surface of
an epithelium-like monolayer cell culture under conditions which allow binding, but not
internalization (e.g., 4°C) or under conditions which allow uptake but not transcytosis
(e.g., 16°C) and the cells are observed microscopically to determine the ability of the
antibodies to bind or to be internalized by the cells of an epithelium-like layer. Ferkol
et al., J. Clin. Invest. 92:2394-2400 have identified an antibody binding domain by
similar methods.

Linkage of a TM to one or more biological agents may be achieved by
any means known to those in the art, such as genetic fusion, covalent chemical
attachment, noncovalent attachment (e.g., adsorption) or a combination of such means.
Selection of a method for linking a TM to a biological agent will vary depending, in
part, on the chemical nature of the agent and depending on whether the agent is to
function at the basolateral surface, within the epithelial cell, or undergo transcytosis.
Linkage by genetic fusion may be performed using standard recombinant DNA
techniques to generate a nucleic acid molecule that encodes a single fusion peptide
containing both the biological agent(s) and the TM. Optionally, the fusion peptide may
contain one or more linker sequences and/or sequences for intracellular targeting (e.g.,
KDEL, protease cleavage sites, nuclear targeting sequences, etc.). The recombinant
nucleic acid molecule is then introduced into an appropriate vector and expressed in
suitable host cells. Techniques for generating such a recombinant molecule and
expressing a fusion peptide are well known to those of ordinary skill in the art (see, e.g.,
Sambrook et al., Molecular Cloning: A Laboratory Manual, 2d ed., Cold Spring Harbor
Laboratory, Cold Spring Harbor, New York, 1989). Any biological agent having a
known polypeptide sequence may be linked to a TM by genetic fusion. For example,
using recombinant techniques, one or more immunoglobulin-derived sequences (e.g.,
single chain antigen binding proteins, hinge, Fv gamma or Fv kappa) may be linked to a
TM at the N- and/or C-terminus.

Linkage may also be achieved by covalent attachment, using any of a
variety of appropriate methods. For example, the TM and biological agent(s) may be
linked using bifunctional reagents (linkers) that are capable of reacting with both the
TM and the biological agent(s) and forming a bridge between the two. Commonly
available bifunctional cross-linkers are capable of joining carbohydrates, amines,
sulfhydryls and carboxyl functional groups, or may employ photoreactive groups to
enable covalent linkage. These reagents are particularly useful for the attachment of,
for example, additional peptide linkers that are in turn attached to biological agents.
Covalent attachment of linkers may be accomplished through bonding to amino acid
side chains present in the antigen combining site of an antibody linked to a TM.
Briefly, attachment of linkers to such residues can occur as a result of the antibody
recognition process itself when the linker is recognized as antigen and compatible
reactive residues are present on the linker and in the binding domain of the antibody.
Such reactive antibodies typically have antigen combining sites containing amino acid
residues with side chains which can act as nucleophiles (e.g., aspartate, glutamate,
glutamine, lysine and/or asparagine). For delivery of agents that will remain within the
epithelial cell, linkers that are cleaved within the target cell may be particularly useful.
Release of the biological agent within the cell may introduce or augment a genetic
capability of the cell (e.g., increasing the p53 protein level in carcinoma cells) or may
inhibit an existing cellular activity (e.g., antisense oligonucleotides may bind functional
intracellular transcripts that are essential for tumorigenesis, tumor maintenance and/or
metastases, such as transcripts that generate high levels of glycolytic enzymes).

Any of a variety of molecules may serve as linkers within the present
invention. Polynucleotide and/or peptide linkers may be used. Such molecules may
then be digested by, for example, intestinal nucleases and proteases (e.g., enterokinase,
trypsin) respectively to release the biological agent. Preferred linkers include
substrates for proteases associated with an epithelial barrier (i.e., proteases resident in,
on or adjacent to epithelial cells or surfaces).

Numerous proteases are present in or associated with epithelial cells
and/or epithelial surfaces. Processing of secreted proteins, for example, requires
proteolytic scission of a portion of the newly synthesized protein (referred to as the
pre-protein) prior to secretion from the cellular endomembrane system. Further processing,
which may be required to liberate an active enzyme from the cell, for example, can
result from additional proteolysis wherein the substrate may be referred to as the
pro-protein or pro-enzyme. The specific proteolytic cleavage sites of these pro-proteins can
be identified by comparison of the amino acid sequence of the final secreted protein
with the sequence of the newly synthesized protein. These cleavage sites identify the
substrate recognition sequences of particular intracellular proteases. One such protease
recognition site, specific to epithelial cells, may reside within the amino acid sequence
from residues 585-600 of the human polyimmunoglobulin receptor (pIgR, SEQ ID
NO:45; numbering according to Piskurich et al., J. Immunol. 154:1735-1747, 1995).
Alternatively, the intracellular scission of pIgR may be contained within residues 601-630
(VRDQAQENRASGDAGSADGQSRSSSSKVLF, SEQ ID NO:111). Subsequent
shortening of SC from the carboxy terminus to yield mature SC may occur due to a
carboxypeptidase in the mucosal environment. Peptides comprising all or part of the
sequence from residue 601 to 630 may be useful for endosomal release of transcytosing
TM-drug conjugates. Another such protease recognition site, which identifies a peptide
substrate for many matrix metalloproteinases (MMPs), comprises the amino acid
sequence PLGIIGG (SEQ ID NO:109). Since cancer cells often contain and secrete
abundant quantities of MMPs, this sequence may be efficiently cleaved specifically in
and around cancer cells. Since cancer cells secrete abundant quantities of proteases, the
intracellular proteases which are responsible for their processing are also in abundance.
One such protease recognition site, which identifies a protease which also may be
abundant in cancer cells, comprises residues 30-40 of procathepsin E (SEQ ID NO:39).
Another type of protease recognition sequence comprises residues in the CH2 region of
human IgA1 (VPSTPPTPSPSTPPTPSPSCCHPRL, SEQ ID NO:112) and is cleavable
by IgA specific proteases secreted by microorganisms.

These protease recognition sites are extremely useful in the design of
scissile linkers enabling the delivery of drugs, imaging compounds, or other biological
agents to the intracellular environment of epithelial cells or to the epithelial barrier in
general. Delivery of such compounds to epithelial cells can be accomplished by using
residues 585-600 of human pIgR (SEQ ID NO:45) or residues 601-630 (SEQ ID
NO:111) as part of the scissile linker joining the biological compound to TM. Delivery
of anti-cancer drugs to tumors of epithelial origin can be accomplished using a substrate
recognition sequence of MMPs (SEQ ID NO:109) or residues 30-40 of procathepsin E
(SEQ ID NO:39) as part of the scissile linker to TM. Alternatively, scissile linkers may
be designed from other cancer cell specific or epithelial barrier specific processing
proteases which may be identified by the comparison of newly synthesized and secreted
proteins or similar techniques. Other types of proteases that can be used to cleave
scissile bonds can be found in the mammalian duodenum, for example. The
enterokinase recognition sequence, (Asp)4-Lys, can be used as a scissile linker for
delivery of biological compounds to the duodenum by TM mediated transcytosis across
the duodenum epithelial barrier. Proteolytic cleavage releases the biological agent with
a small fragment of linker (e.g., VQYT, SEQ ID NO:40, from procathepsin; EKVAD,
SEQ ID NO:41, from pIgR; or IIGG, SEQ ID NO:110, from the general MMP substrate
sequence). Such residual linker segments may in turn be further digested by proteolytic
enzymes (e.g., carboxypeptidase II or aminopeptidase I) to yield an unmodified
biological agent.
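
As an illustration of how a candidate scissile linker might be inspected for the
recognition sequences named above, the following minimal sketch scans a linker for
those sequences and shows the relationship between the MMP substrate sequence and its
residual fragment. It is not part of the described methods. The function names, the
Gly-Gly-Ser spacers in the sample linker and the assumption that the residual fragment
is the C-terminal end of the substrate sequence are all hypothetical choices made for
the example.

```python
# Recognition sequences quoted in the text (one-letter amino acid code).
RECOGNITION_SITES = {
    "pIgR residues 601-630 (SEQ ID NO:111)": "VRDQAQENRASGDAGSADGQSRSSSSKVLF",
    "MMP substrate (SEQ ID NO:109)": "PLGIIGG",
    "enterokinase (Asp)4-Lys": "DDDDK",
}


def find_sites(linker):
    """Return sorted (start index, site name) pairs for each recognition sequence found."""
    hits = []
    for name, motif in RECOGNITION_SITES.items():
        start = linker.find(motif)
        while start != -1:
            hits.append((start, name))
            start = linker.find(motif, start + 1)
    return sorted(hits)


def split_at_fragment(motif, fragment):
    """Split a motif into (leading part, residual fragment).

    Assumes the residual fragment is the C-terminal end of the motif.
    """
    if not motif.endswith(fragment):
        raise ValueError("fragment is not the C-terminal end of the motif")
    return motif[: len(motif) - len(fragment)], fragment


if __name__ == "__main__":
    # Hypothetical linker; the GGS spacers are not taken from the text.
    linker = "GGS" + "PLGIIGG" + "GGS" + "DDDDK"
    for start, name in find_sites(linker):
        print(start, name)
    print(split_at_fragment("PLGIIGG", "IIGG"))
```

A plain substring search of this kind only locates exact copies of a recognition
sequence; it says nothing about whether a given protease would in fact cleave the
linker in its folded or conjugated context.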

Scissile peptide linkers are generally from about 5 to about 50 amino
acid residues in length. They can be covalently linked to TM or to adducts attached to
TM by genetic fusion techniques (i.e., in frame with the 5' or 3' sequence of TM
codons or adduct codons) or by any of a variety of chemical procedures enabling the
joining of various functional groups (e.g., NH2, COOH, SH). Alternatively, the scissile
peptide can itself comprise an antigen which may then be bound to TMs containing a
cognate antigen binding capability. For example, a scissile peptide comprising the
sequence -Glu-Gln-Lys-Leu-Ile-Ser-Glu-Asp-Leu- (SEQ ID NO:113) will be
recognized and bound by an anti-myc antibody (e.g., Cat. No. R950-25, Invitrogen,
Carlsbad, California). Similarly, a scissile peptide containing an oligohistidine at its
carboxy terminus will be recognized and bound by an anti-His(C-term) antibody (e.g.,
Cat. No. R930-25, Invitrogen, Carlsbad, California).
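
For illustration, the sketch below converts a peptide written in three-letter code,
as in the sequence recited above, to one-letter code and checks it against the
approximate 5 to 50 residue range stated for scissile peptide linkers. It is a
hypothetical helper, not part of the described methods; the function names are
assumptions, and because the stated range is qualified by "about", the hard bounds
used here are a simplification.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter):
    """Convert a hyphen-separated three-letter peptide to one-letter code."""
    residues = [r.strip() for r in three_letter.strip("- ").split("-") if r.strip()]
    return "".join(THREE_TO_ONE[r.capitalize()] for r in residues)


def within_typical_length(peptide, low=5, high=50):
    """Check a one-letter peptide against the approximate linker length range."""
    return low <= len(peptide) <= high


if __name__ == "__main__":
    # Sequence as recited in the text (SEQ ID NO:113).
    peptide = to_one_letter("-Glu-Gln-Lys-Leu-Ile-Ser-Glu-Asp-Leu-")
    print(peptide, len(peptide), within_typical_length(peptide))
```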

Other substrates for intracellular proteases associated with an epithelial
barrier include, but are not limited to, substrates for a phospholipase or glycosidase.
Alternatively, a linker may comprise repeating positively charged lysine residues that
will bind negatively charged nucleic acid molecules for release in the cell. Peptide
linkers may be particularly useful for peptide biological agents, such as the antibiotic
cecropins, magainins and mastoparans.

Carbohydrates may be covalently attached to native carbohydrate or to
the polypeptide backbone of a TM, and employed as linkers. Suitable carbohydrates
include, but are not limited to, lactose (which may be degraded by a lactase residing in,
for example, the small intestine), sucrose (digested by a sucrase) and α-limit dextrin
(digested by a dextrinase). Enzymes responsible for cleaving carbohydrate linkers can
be found attached to the brush border membranes of the luminal surface of the epithelial
barrier. Sucrase-isomaltase, for example, will cleave 1,4-α bonds of maltose,
maltotriose and maltopentose. An intestinal brush border specific linker would
therefore be comprised of any polymer of maltose linked by 1,4-α bonds. When
attached to TM, the linker would pass through the epithelial barrier by transcytosis and
would only be cleaved by sucrase-isomaltase resident on the apical surface of the
epithelial barrier.

Lipids may also, or alternatively, be covalently attached to the
polypeptide backbone for use as linkers. A monoglyceride employed in this manner
may then be digested by intestinal lipase to release a biological agent linked to glycerol
or a fatty acid. Phospholipids may be attached to a TM via a peptide linkage to the
phosphatidylserine polar head group or by an ether or ester linkage to one of the
hydroxyl groups of the head group of phosphatidyl inositol. The non-polar head group
(diacylglycerol) may be substituted entirely by the biological agent in active or inactive
form. For example, a penicillin linked via its R group to the phosphate of
1-phospho-myo-inositol-TM will be inactive until released by a phospholipase C derived from a
bacterial infection. Other suitable linker moieties will be apparent to those of ordinary
skill in the art.

Linkage may also be performed by forming a covalent bond directly
between a TM and a biological agent. Regardless of whether a linker is employed, any
of a variety of standard methods may be used to form a covalent linkage. For peptide
biological agents and linkers, such a covalent bond may be a disulfide bond between
cysteine residues of the TM and biological agent. Briefly, such bonds may be formed
during the process of secretion from the endomembrane system of higher organisms. In
such cases, the peptide biological agent(s) and TM must contain appropriate signals
specifying synthesis on endomembranes. Such signals are well known to those of
ordinary skill in the art. Reactive antibodies may covalently attach directly to a
biological agent or a linker. Antibodies raised against antigens containing reactive
groups or transition state analogs for specific reactions may contain residues in the
combining site capable of forming covalent interactions with the antigen or with similar
molecules. An example of such a reaction involves a lysine residue in the
combining site of the monoclonal antibody 38C2, which reacts to form a vinylogous
amide linkage with diketone and other closely related molecules (Wagner et al., Science
270:1797-1800, 1995). A TM containing a reactive antibody or the combining site of a
reactive antibody can be used to form covalent bonds with linkers of lipid, peptide,
carbohydrate, nucleic acid or other compositions. TMs containing biological agents
attached to TM via covalent bonds in the combining site can be expected to have
normal conformations and functions in the antibody domain. The absence of
modifications to antibody structure outside the antigen combining site minimizes the
potential for altering the recognition of such molecules as foreign when introduced into
the body. Further, the molecules tethered through combining sites of antibodies of
human origin are expected to have half-lives in serum and other body compartments
similar to those of native antibodies and have a low propensity to stimulate antibody
responses against the TM.

As noted above, any therapeutic biological agent may be linked to a TM.
Biological agents include, but are not limited to, proteins, peptides and amino acids;
nucleic acids and polynucleotides; steroids; vitamins; polysaccharides; minerals; fats;
inorganic compounds and cells or cell components. A biological agent may also be a
prodrug that generates an agent having a biological activity in vivo. In general,
biological agents may be attached using a variety of techniques as described above, and
may be present in any orientation.

The category of peptide biological agents includes a variety of binding
agents. As used herein, a "binding agent" is any compound that binds to a molecule
within the cell and inactivates and/or facilitates removal of the molecule. Binding
agents include single chain antigen binding proteins, which may be used, for example,
to inhibit viral pathogen assembly by binding essential components inside the cell and
subsequently transcytosing components across the apical boundary; to bind and remove
bacterial toxins by transcytosis; to bind and remove serum or cellular toxins or
metabolites; or to bind and remove environmental toxins.

A binding agent may also be an antigen combining site such as, but not
limited to, a reactive antigen combining site. For example, an antigen combining site
may bind to an enzyme (e.g., an active site), and inhibit an activity of the enzyme. An
antigen combining site may also bind to other molecules, such as, for example, a
ribosome or transporter, and inhibit other cellular functions.

Enzymes may also be employed, including kinases, transferases,
hydrolases, isomerases, proteases, ligases and oxidoreductases such as esterases,
phosphatases, glycosidases and peptidases. For example, an enzyme linked to a TM
could result in specific proteolytic cleavage of bacterial toxins, attachment proteins or
essential cell surface functions (viral or bacterial), proteolytic cleavage of secreted
cancer cell specific proteins (such as proteases) that are essential for tumor maintenance
or metastases, degradation of cell surface carbohydrates essential to pathogenicity of
viruses or bacteria or specific transfer of biochemical functions (such as
phosphorylation) to inhibit extracellular cancer cell specific or pathogen specific
functions.

Peptide biological agents may also be enzyme inhibitors (e.g., leupeptin,
chymostatin or pepstatin); hormones (e.g., insulin, proinsulin, glucagon, parathyroid
hormone, colony stimulating factor, growth hormone, thyroid hormone, erythropoietin,
follicle stimulating hormone, luteinizing hormone, tumor necrosis factors); hormone
releasing hormones (e.g., growth hormone releasing hormone, corticotropin releasing
factor, luteinizing hormone releasing hormone, growth hormone release inhibiting
hormone (somatostatin), chorionic gonadotropin releasing factor and thyroid releasing
hormone); cell receptors (e.g., hormone receptors such as estrogen receptor) and cell
receptor subunits; growth factors (e.g., tumor angiogenesis factor, epidermal growth
factor, nerve growth factor, insulin-like growth factor); cytokines (e.g., interferons and
interleukins); histocompatibility antigens; cell adhesion molecules; neuropeptides;
neurotransmitters such as acetylcholine; lipoproteins such as alpha-lipoprotein;
proteoglycans such as hyaluronic acid; glycoproteins such as gonadotropin hormone;
antibodies (polyclonal, monoclonal or fragment); as well as analogs and chemically
modified derivatives of any of the above.

Polynucleotide biological agents include antisense oligonucleotides
(DNA or RNA) such as HIV, EBV EBNA-1 or reverse transcriptase antisense
nucleotides; polynucleotides directed against active oncogenes or viral-specific gene
products and polynucleotides complementary to unique sequences in the autoimmune
B-cell immunoglobulin genes or T-cell receptor genes, or to mutant protein alleles (e.g.,
the mutant β-amyloid protein); and polynucleotides encoding proteins (e.g., DNA
within expression vectors or RNA) including drug resistance genes. Also included are
polynucleotide agents with catalytic activities (e.g., ribozymes) or with the ability to
covalently bind to cellular or viral DNA, RNA or proteins. Nucleotides (e.g., thymine)
and radionuclides (e.g., iodine, bromine, lead, palladium, copper) may also be
employed.

A wide variety of steroid biological agents may be employed, including
progesterone, androgens and estrogens (including contraceptives such as ethinyl
estradiol). Similarly, agents such as vitamins (including fat soluble vitamins such as
vitamins A, D, E and K and analogs thereof) may be linked to a TM. Inorganic
biological agents include oxides, such as iron oxide. Polysaccharide biological agents
include any of a variety of carbohydrates, as well as lipopolysaccharides and
compounds such as heparin.

Biological agents linked to TMs may have any of a wide variety of
activities in vivo. For example, a biological agent may be an antiviral agent (e.g., a
nucleotide or nucleoside analog, such as Ara-AMP, DDA or AZT, an antiviral antibody
or other agent such as rifampicin and acyclovir), an antibacterial agent (e.g., penicillin,
sulfanilamides, cecropins, magainins, mastoparans, actinomycin, gramicidin,
aminoglycosides such as gentamycin, streptomycin and kanamycin; bleomycins such as
bleomycin A2, doxorubicin, daunomycin and antisense nucleotides complementary to
the 3' terminus of prokaryotic 16S rRNA), an antifungal agent (e.g., azoles such as
fluconazole, polyene macrolides such as amphotericin B and candicidin), an
antiparasitic agent (e.g., antimonials or antisense nucleotides complementary to a
conserved sequence of the haem polymerase gene of Plasmodium falciparum or to a
nucleotide leader sequence common to parasites such as trypanosomes) or an antitumor
agent (e.g., 5-fluorouracil, methotrexate and intercalating agents such as
cis-diaminodichloroplatinum).

A biological agent may also be a chemoprotective agent (e.g.,
N-acetyl-L-cysteine, folinic acid); a radioprotective agent (e.g., WR 2721, selenium, melanins,
cysteamine derivatives, phenolic functional groups such as 6-hydroxy-chroman-2-carboxylic
acids (e.g., Trolox) and tocopherols) or a cytotoxic agent (e.g., nitrogen
mustard agents such as L-phenylalanine nitrogen mustard or cyclophosphamide,
antifolates, vinca alkaloids, anthracyclines, mitomycins, cytotoxic nucleosides, the
pterine family of drugs, podophyllotoxins, sulfonylureas, trichothecenes and
colchicines; specific cytotoxic agents include aminopterin, taxol, doxorubicin,
fostriecin, camptothecin, methopterin, dichloromethotrexate, mitomycin C,
porfiromycin, 6-mercaptopurine, cytosine arabinoside, podophyllotoxin, etoposide,
melphalan, vinblastine, vincristine, desacetylvinblastine hydrazide, leurosidine,
vindesine, leurosine, trichothecene, desacetylcolchicine, paclitaxel, carminomycin,
4'-epiadriamycin, 4-demethoxy-daunomycin, 11-deoxydaunorubicin,
13-deoxydaunorubicin, adriamycin-14-benzoate, adriamycin-14-octanoate,
adriamycin-14-naphthaleneacetate, N-methyl mitomycin C, dideazatetrahydrofolic acid, colchicine
and cisplatin).

In other embodiments, a biological agent may be an immunomodulating
agent or vaccine; an antihistamine (e.g., diphenhydramine, chlorpheniramine); a drug
that affects the cardiovascular, renal or hepatic system; a sympathomimetic drug such as
catecholamines (e.g., epinephrine) and non-catecholamines (e.g., phenylephrine and
pseudoephedrine); a hormone antagonist; a toxin such as diphtheria toxin, ricin, abrin,
Pseudomonas aeruginosa exotoxin A, ribosomal inactivating proteins, mycotoxins such
as trichothecenes and gelonin; a vasoactive agent; an anticoagulant; an anesthetic or
sedative (e.g., dibucaine); a decongestant; or a pain reliever (e.g., narcotic).

A biological agent may also be a neuroactive agent, including
neuroleptics such as phenothiazines (e.g., compazine, thorazine, promazine,
chlorpromazine, acepromazine, aminopromazine, perazine, prochlorperazine,
trifluoperazine and thioproperazine); rauwolfia alkaloids (e.g., reserpine and deserpine);
thioxanthenes (e.g., chlorprothixene); butyrophenones (e.g., haloperidol, moperone,
trifluperidol, timiperone and droperidol); diphenylbutylpiperidines (e.g., pimozide);
benzamides (e.g., sulpiride and tiapride); tranquilizers such as glycerol derivatives (e.g.,
mephenesin and methocarbamol), propanediols (e.g., meprobamate), diphenylmethane
derivatives (e.g., orphenadrine, benztropine and hydroxyzine) and benzodiazepines
(e.g., chlordiazepoxide and diazepam); hypnotics (e.g., zolpidem and butoctamide);
beta-blockers (e.g., propranolol, acebutolol, metoprolol and pindolol); antidepressants
such as dibenzazepines (e.g., imipramine), dibenzocycloheptenes (e.g., amitriptyline)
and tetracyclics (e.g., mianserine); MAO inhibitors (e.g., phenelzine, iproniazid and
selegiline); psychostimulants such as phenylethylamine derivatives (e.g.,
amphetamines, dexamphetamines, fenproporex, phentermine, amfepramone and
pemoline) and dimethylaminoethanols (e.g., clofenciclan, cyprodenate, aminorex and
mazindol); GABA-mimetics (e.g., progabide); alkaloids (e.g., codergocrine,
dihydroergocristine and vincamine); cholinergics (e.g., citicoline and physostigmine);
vasodilators (e.g., pentoxifylline); or cerebroactive agents (e.g., pyritinol and
meclofenoxate).

Table 1 below provides some examples of representative combinations
of TM (with or without immunoglobulin-derived sequence(s)) and biological agent(s).
In some cases, linkers are also indicated. For such combinations, intracellular delivery
may be achieved using appropriate scissile linkers. Alternatively, other intracellular
targeting sequences (e.g., KDEL) may be incorporated. In the absence of sequences
that direct the TM intracellularly, the TMs provided in Table 1 deliver the biological
agent(s) via transcytosis. Multiple orientations for all TM attachments are possible.
Table I

Representative Targeting Molecule/Biological Agent Combinations

Combination                   Variations/Comments

Genetic Fusions

TM-scabp
scabp-TM
scabp-TM-scabp
TM/alpha3-scabp(s)            Either or both ligands N or C
TM/alpha3,2-scabp(s)          "
TM/alpha3,2,1-scabp(s)        "
TM/mu4-scabp(s)               "
TM/mu4,3-scabp(s)             "
TM/mu4,3,2-scabp(s)           "
TM/mu4,3,2,1-scabp(s)         "
TM-Fv                         gamma or kappa Fv; associated with complementary
                              Fv to form antigen binding site, Fab
Fv-TM
Fv-TM-Fv
TM/alpha3-Fv(s)               Either or both ligands N or C
TM/alpha3,2-Fv(s)             "
TM/alpha3,2,1-Fv(s)           "
TM/mu4-Fv(s)                  "
TM/mu4,3-Fv(s)                "
TM/mu4,3,2-Fv(s)              "
TM/mu4,3,2,1-Fv(s)            "
TM-hinge-Fv                   gamma or kappa hinge-Fv; associated with
                              complementary Fv-hinge to form antigen
                              binding site, Fab
Fv-hinge-TM-hinge-Fv
TM/alpha3,2-hinge-Fv(s)       Either or both ligands N or C
TM/alpha3,2,1-hinge-Fv(s)     "
TM/mu4-hinge-Fv(s)            "
TM/mu4,3-hinge-Fv(s)          "
TM/mu4,3,2-hinge-Fv(s)        "
TM/mu4,3,2,1-hinge-Fv(s)      "
TM-Enz
Enz-TM
Enz-TM-Enz
TM/alpha3-Enz(s)              Either or both ligands N or C
TM/alpha3,2-Enz(s)            "
TM/alpha3,2,1-Enz(s)
TM/mu4-Enz(s)
TM/mu4,3-Enz(s)
TM/mu4,3,2-Enz(s)
TM/mu4,3,2,1-Enz(s)

Chemical Modifications

TM-carbo
carbo-TM
carbo-TM-carbo
TM/ligand-carbo(s)
TM-lipid
lipid-TM
lipid-TM-lipid
TM/ligand-lipid(s)
TM-nucleic acid
nucleic acid-TM
nucleic acid-TM-nucleic acid
TM/ligand-nucleic acid(s)
TM-peptide
peptide-TM
peptide-TM-peptide
TM/ligand-peptide(s)
TM-nucleic acid/antiviral
antiviral/nucleic acid-TM
antiviral/nucleic acid-TM-nucleic acid/antiviral
TM/ligand-nucleic acid/antiviral(s)
TM-lipid-antibiotic
antibiotic-lipid-TM
antibiotic-lipid-TM-lipid-antibiotic
TM/ligand-lipid-antibiotic(s)
TM-peptide-antibiotic
antibiotic-peptide-TM
antibiotic-peptide-TM-peptide-antibiotic
TM/ligand-peptide-antibiotic(s)

TM = targeting molecule; scabp = single chain antigen binding protein; Enz = enzyme;
carbo = carbohydrate; ligand = immunoglobulin-derived sequence (alpha3, alpha2
and/or alpha1; mu4, mu3, mu2 and/or mu1); N = NH2 terminal; C = COOH terminal
Of course, the above examples of biological agents are provided solely
for illustrative purposes and are not intended to limit the scope of the invention. Other
agents that may be employed within the context of the present invention will be
apparent to those having ordinary skill in the art.

In one embodiment, a targeting molecule as described above is linked to
a biological agent that is not naturally associated with the targeting molecule. Within
the context of this embodiment, the biological agent is not iodine. The biological agent
may, for example, be an enzyme, binding agent, inhibitor, nucleic acid, carbohydrate or
lipid. In one preferred embodiment the biological agent comprises an antigen
combining site.

TMs linked to one or more biological agents may be used for a variety of
therapeutic purposes. In general, such TMs may be employed whenever it is
advantageous to deliver a biological agent to epithelial tissue (for internalization and/or
transcytosis). For example, a variety of conditions associated with an epithelial surface
(i.e., conditions where an infectious agent gains access to the body through an epithelial
surface; where an infectious agent is resident in or on epithelial cells or surfaces; where
epithelial barriers are compromised due to a disease condition; or where epithelial tissue
or cells are dysfunctional, transformed or the focus of an inflammatory response) may
be treated and/or prevented using biological agents linked to TMs. Such conditions
include, but are not limited to, cancer, viral infection, inflammatory disorders,
autoimmune disorders, asthma, celiac disease, colitis, pneumonia, cystic fibrosis,
bacterial infection, mycobacterial infection and fungal infection (such as yeast
infection). Appropriate biological agents will vary depending on the nature of the
condition to be treated and/or prevented and include those provided above, as well as
others known to those of ordinary skill in the art.

As used herein, "treatment" refers to a lessening of symptoms or a delay
in, or cessation of, the progression of the condition. A biological agent linked to a TM
is generally administered to a patient afflicted with the condition in the form of a
pharmaceutical composition, at a therapeutically effective dosage. To prepare a
pharmaceutical composition, an effective concentration of one or more TM-biological
agent complexes is mixed with a suitable pharmaceutical carrier or vehicle.
Alternatively, a pharmaceutical composition may contain cells from the host or from
another organism (e.g., a myeloma cell, stem cell, dendritic cell, hepatocyte or basal
cell) which, when introduced into the body of the host, produce a TM. An amount of a
TM (or cells that produce a TM in vivo) that, upon administration, ameliorates the
symptoms or treats the disease is considered effective. Therapeutically effective
concentrations and amounts may be determined empirically by testing the TMs in
known in vitro and in vivo systems; dosages for humans or other animals may then be
extrapolated therefrom. Pharmaceutical carriers or vehicles include any such carriers
known to those skilled in the art to be suitable for the particular mode of administration.

The compositions of the present invention may be prepared for
administration by a variety of different routes, including orally, parenterally,
intravenously, intradermally, subcutaneously or topically, in liquid, semi-liquid or solid
form and are formulated in a manner suitable for each route of administration.
Preferred modes of administration depend upon the indication treated.

Solutions or suspensions used for oral, parenteral, intradermal,
subcutaneous or topical application can include one or more of the following
components: a sterile diluent, saline solution (e.g., phosphate buffered saline), fixed oil,
polyethylene glycol, glycerin, propylene glycol or other synthetic solvent; antimicrobial
agents, such as benzyl alcohol and methyl parabens; antioxidants, such as ascorbic acid
and sodium bisulfite; chelating agents, such as ethylenediaminetetraacetic acid (EDTA);
buffers, such as acetates, citrates and phosphates; and agents for the adjustment of
tonicity such as sodium chloride or dextrose. In addition, other pharmaceutically active
ingredients and/or suitable excipients such as salts, buffers, stabilizers and the like may,
but need not, be present within the composition. Liposomal suspensions may also be
suitable as pharmaceutically acceptable carriers. These may be prepared according to
methods known to those skilled in the art.

A TM may be prepared with carriers that protect it against rapid
elimination from the body, such as time release formulations or coatings. Such carriers
include controlled release formulations, such as, but not limited to, implants and
microencapsulated delivery systems, and biodegradable, biocompatible polymers, such
as ethylene vinyl acetate, polyanhydrides, polyglycolic acid, polyorthoesters, polylactic
acid and others.

A pharmaceutical composition is generally formulated and administered
to exert a therapeutically useful effect while minimizing undesirable side effects. The
number and degree of acceptable side effects depend upon the condition for which the
composition is administered. For example, certain toxic and undesirable side effects are
tolerated when treating life-threatening illnesses, such as tumors, that would not be
tolerated when treating disorders of lesser consequence. The concentration of
biological agent in the composition will depend on absorption, inactivation and
excretion rates thereof, the dosage schedule and the amount administered, as well as
other factors known to those of skill in the art.

The composition may be administered one time, or may be divided into a
number of smaller doses to be administered at intervals of time. The precise dosage and
duration of treatment are a function of the disease being treated and may be determined
empirically using known testing protocols or by extrapolation from in vivo or in vitro
test data. Dosages may also vary with the severity of the condition to be alleviated. For
any particular subject, specific dosage regimens may be adjusted over time according to
the individual need of the patient.

The following examples are offered by way of illustration and not by
way of limitation.

EXAMPLES

Example 1
Preparation of Targeting Molecules

This example illustrates the preparation of representative targeting
molecules.

A. Purification of Representative TMs from Biological Sources 

Preparation of dimeric IgA (dIgA). Ten ml of human IgA myeloma
plasma (International Enzymes, Inc., Fallbrook, California) is mixed with an equal
volume of PBS, and 20 ml of saturated ammonium sulfate (in H2O) is added dropwise
with stirring. After overnight incubation at 4°C, the precipitate is pelleted by
centrifugation at 17,000 x g for 15 minutes, and the supernatant fraction is discarded.
The pellet is resuspended in 2 ml PBS. The resulting fraction is clarified by
centrifugation at 13,500 x g for 5 minutes and passage through a 0.45 µm filter (Nylon
66, 13 mm diameter, Micron Separations, Inc., Westborough, Massachusetts). Two ml
(about half) of the clarified fraction is applied to a Sephacryl® S-200 column (1.6 x 51
cm; 0.25 ml/min PBS + 0.1% sodium azide) (Pharmacia, Piscataway, New Jersey), and
2 ml fractions are collected. Those fractions found to have the highest concentrations of
dIgA (by SDS-PAGE analysis of 10 µl of each fraction) are lyophilized, resuspended in
200 µl deionized H2O, and applied to a Superose® 6 column (1.0 x 30 cm; 0.25 ml/min
PBS + 0.1% sodium azide) (Pharmacia, Piscataway, New Jersey). One ml fractions are
collected and analyzed by SDS-PAGE. Fraction 13 is found to contain dIgA at over
90% purity.
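
The ammonium sulfate step above can be checked with simple arithmetic. The following is a minimal sketch, assuming that volumes are additive (an approximation for concentrated salt solutions); the function name is illustrative and is not part of the described procedure.

```python
def percent_saturation(sample_ml: float, saturated_ml: float) -> float:
    """Nominal percent saturation after adding saturated ammonium sulfate.

    Assumes the volumes simply add, which is only approximately true.
    """
    return 100.0 * saturated_ml / (sample_ml + saturated_ml)


# 10 ml plasma + 10 ml PBS, then 20 ml saturated ammonium sulfate
diga_cut = percent_saturation(sample_ml=10.0 + 10.0, saturated_ml=20.0)
print(f"Nominal ammonium sulfate saturation: {diga_cut:.0f}%")
```

Under this assumption the precipitation is carried out at a nominal 50% saturation.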

Preparation of J chain by mild reduction of dIgA. A 1 ml sample
containing less than 10 mg of dIgA is prepared as described above and dialyzed against
buffer containing 100 mM sodium phosphate pH 6.0 and 5 mM EDTA. Six mg
2-mercaptoethylamine HCl are added to yield a final concentration of 0.05 M, and the
sample is incubated at 37°C for 90 minutes. The reduced protein is passed over a
desalting column equilibrated in PBS + 1 mM EDTA. The protein-containing fractions
are detected by assay with BCA reagent. J chain is then further purified by gel filtration
and ion exchange chromatography.
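
As a plausibility check on the reductant concentration, the sketch below converts the stated mass to molarity. It assumes a formula weight of 113.61 g/mol for 2-mercaptoethylamine hydrochloride and that the sample volume remains 1 ml after the solid is added; both are assumptions of this illustration rather than statements from the procedure.

```python
MEA_HCL_G_PER_MOL = 113.61  # assumed formula weight of 2-mercaptoethylamine HCl


def molarity(mass_mg: float, volume_ml: float, g_per_mol: float) -> float:
    """Return the molar concentration for a mass dissolved in a volume."""
    moles = (mass_mg / 1000.0) / g_per_mol
    liters = volume_ml / 1000.0
    return moles / liters


reductant_molar = molarity(mass_mg=6.0, volume_ml=1.0, g_per_mol=MEA_HCL_G_PER_MOL)
print(f"2-mercaptoethylamine HCl: {reductant_molar:.3f} M")
```

The result is close to the stated final concentration of 0.05 M.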

Preparation of secretory IgA (sIgA). One hundred ml of human breast
milk (Lee Scientific, Inc., St. Louis, Missouri) is mixed with 100 ml PBS and
centrifuged at 17,000 x g for 1 hour at 4°C. The clear layer below the fat is transferred
to clean centrifuge bottles and centrifuged at 17,000 x g for 30 minutes at 4°C. The pH
of the sample is adjusted to 4.2 with 2% acetic acid. After incubation at 4°C for 1 hour,
the sample is centrifuged at 17,000 x g for 1 hour at 4°C, and the supernatant fraction is
transferred to new tubes and adjusted to pH 7 with 0.1 M NaOH. An equal volume of
saturated ammonium sulfate is added, with stirring, and the sample is incubated at 4°C
overnight. The precipitated material is pelleted by centrifugation (17,000 x g, 90
minutes, 4°C), resuspended in approximately 7 ml PBS, and dialyzed extensively
against PBS at 4°C.

Of the resulting approximately 25 ml, 1.1 ml is further purified.
Undissolved solids are removed by centrifugation (13,500 x g, 10 minutes) and an equal
volume of 0.05 M ZnSO4 is added to the clarified supernatant fraction. The pH is
adjusted to 6.85 by addition of approximately 40 µl 1 M NaOH. After allowing the
material to sit for 5 minutes at room temperature, the sample is centrifuged at 13,500 x
g for 10 minutes at room temperature. One and a half ml of the supernatant is mixed
with 1.5 ml of saturated ammonium sulfate and allowed to stand at 4°C for 1 hour.
Precipitating material is pelleted by centrifugation (13,500 x g, 10 minutes, room
temperature) and is found to be greater than 90% sIgA by SDS-PAGE analysis.

Preparation of a molecule consisting of nicked J-chain crosslinked to
two alpha-chain-derived peptides (CNBr cleavage fragment). A pellet containing sIgA
prepared as described above ("Preparation of sIgA") is resuspended in 375 µl deionized
H2O. The sample is transferred to a glass vial and the vial is filled almost to the rim
with 875 µl formic acid. Approximately 20 mg solid CNBr is added and a Teflon
septum is used to seal the vial. The reaction is allowed to proceed at 4°C overnight.
The sample is then dialyzed against deionized H2O (two changes) and against PBS at
4°C, lyophilized, resuspended with 200 µl H2O, and applied to a Superose® 6
column (1.0 x 30 cm, 0.25 ml/min PBS + 0.1% sodium azide). One ml fractions are
collected. The fractions containing J chain are identified by immunoblotting of
SDS-PAGE-separated proteins from aliquots of each fraction.

The fraction with the highest concentration of J chain is passed through a
PD-10 column (Pharmacia, Uppsala, Sweden) equilibrated in 50 mM Tris-Cl pH 8.1,
and applied to a 20 PI Poros anion exchange column (4.6 mm x 100 mm; PerSeptive
Biosystems, Inc., Framingham, Massachusetts). The column is washed with 10 ml of
50 mM Tris-Cl pH 8.1, and eluted with a linear 0 - 1.0 M NaCl gradient in 50 mM
Tris-Cl pH 8.1 (15 ml gradient). Elution of proteins from the column is monitored as
absorbance at 280 nm and the J chain-containing fractions are identified by
immunoblotting of SDS-PAGE-separated aliquots.
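
The linear gradient used for elution can be expressed as a simple interpolation. The sketch below is illustrative only: it assumes an ideal linear gradient with no mixing delay or column dead volume, and the function name is hypothetical.

```python
def gradient_concentration(volume_ml: float,
                           start_molar: float = 0.0,
                           end_molar: float = 1.0,
                           gradient_ml: float = 15.0) -> float:
    """NaCl concentration delivered after volume_ml of an ideal linear gradient."""
    if not 0.0 <= volume_ml <= gradient_ml:
        raise ValueError("volume must lie within the gradient")
    return start_molar + (end_molar - start_molar) * volume_ml / gradient_ml


# Nominal NaCl concentration at a few points along the 15 ml gradient
for delivered_ml in (0.0, 5.0, 7.5, 15.0):
    print(f"{delivered_ml:5.1f} ml -> {gradient_concentration(delivered_ml):.2f} M NaCl")
```

Such a calculation may be used to estimate the salt concentration at which a J chain-containing fraction elutes, given the volume at which it is collected.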

Alternative Methods for J Chain Purification. A variety of sources are
suitable as starting material for isolation of human J chain. Polymeric IgA from sera of
patients with IgA multiple myeloma, secretory IgA or IgM from sera of patients with
Waldenström's macroglobulinemia, as well as secretory IgA from human breast milk
can be used as starting material for purification of J chain. Although the differences in
the molecular weights of J chain (16,000) and L chains (22,500) should be large enough
to allow satisfactory separation of these two chains by gel filtration, the unique
conformation of J chain and its ability to dimerize often result in co-elution of J chain
with L chain. Isolation procedures take advantage of J chain's negative charge (due to
the high content of aspartic and glutamic acid residues), further increased by
S-sulfitolysis or alkylation of reduced cysteine residues with iodoacetic acid. J chain can
be subsequently separated from H and L chains by DEAE- or CM-cellulose
chromatography using a linear salt gradient or by preparative electrophoresis in the
presence or absence of dissociating agents.

Purification on DEAE-cellulose results in the isolation of
immunochemically and physicochemically homogeneous J chain. As a starting
material, the J chain-containing L chain fraction of polymeric IgA, S-IgA, or IgM,
obtained by partial oxidative sulfitolysis and subsequent gel filtration on Sephadex®
G-200 in 5 M guanidine-HCl can be used. Alternatively, S-sulfonated IgA or S-IgA can
be directly applied on DEAE-cellulose. However, it is usually necessary to perform an
additional separation using gel filtration on Sephadex® G-200 in 5 M guanidine-HCl to
remove contaminating H chains.
Starting materials consist of the following reagents: L chain fraction of
serum polymeric IgA or IgM, or colostral S-IgA; 0.01 M disodium phosphate in
deionized 8 M urea solution and the same buffer with 0.7 M NaCl; DEAE-cellulose
equilibrated in 0.01 M disodium phosphate containing 8 M urea; Sephadex® G-25
column in 1% NH4HCO3 solution.

Lyophilized L chain fraction is dissolved in 0.01 M disodium phosphate
in 8 M urea, and applied on a DEAE-cellulose column equilibrated in the same
phosphate solution. The column is thoroughly washed with this buffer. Adsorbed
proteins are eluted with a linear gradient of 0.01 M disodium phosphate in 8 M urea and
0.01 M disodium phosphate with 0.7 M NaCl. Two fractions are obtained, the later
fraction containing J chain.

The J chain-containing fraction is desalted on a Sephadex® G-25 column
in 1% NH4HCO3 adjusted to neutrality by bubbling with CO2. The purity of J chain can
be assessed by alkaline-urea gel-electrophoresis or immunoelectrophoresis with anti-L,
H, and J chain reagents.

B. Direct Synthesis of TM Polypeptides

Manual syntheses are performed with BOC-L-amino acids purchased
from Biosearch-Milligen (Bedford, Massachusetts). Machine-assisted syntheses are
performed with BOC-L-amino acids from Peptide Institute (Osaka, Japan) and Peptides
International (Louisville, Kentucky). BOC-D-amino acids are from Peptide Institute.
BOC-L-His(DNP) and BOC-L-Aba are from Bachem Bioscience (Philadelphia,
Pennsylvania). Boc-amino acid-(4-carboxamidomethyl)-benzyl-ester-copoly(styrene-
divinylbenzene) resins [Boc-amino acid-OCH2-Pam-resins] are obtained from Applied
Biosystems (Foster City, California) and 4-methylbenzhydrylamine (4MeBHA) resin is
from Peninsula Laboratories, Inc. (Belmont, California). Diisopropylcarbodiimide
(DIC) is from Aldrich, and 2-(1H-benzotriazol-1-yl)-1,1,3,3-tetramethyluronium
hexafluorophosphate (HBTU) is obtained from Richelieu
Biotechnologies (Quebec, Canada). For manual syntheses N,N-diisopropylethylamine
(DIEA), N,N-dimethylformamide (DMF), dichloromethane (DCM) (all peptide
synthesis grade) and 1-hydroxybenzotriazole (HOBt) are purchased from Auspep
(Melbourne, Australia). For machine-assisted syntheses, DIEA and DCM are from
ABI, and DMF is from Auspep. Trifluoroacetic acid (TFA) is from Halocarbon (New
Jersey). Acetonitrile (HPLC grade) is obtained from Waters Millipore (Milford,
Massachusetts). HF is purchased from Mallinckrodt (St. Louis, Missouri). Other
reagents and solvents are ACS analytical reagent grade. Screw-cap glass peptide
synthesis reaction vessels (20 mL) with a #2 sintered glass filter frit are obtained from
Embel Scientific Glassware (Queensland, Australia). A shaker for manual solid phase
peptide synthesis is obtained from Milligen (Bedford, Massachusetts). An all-Kel F
apparatus (Toho; from Peptide Institute, Osaka) is used for HF cleavage. Argon, helium
and nitrogen (all ultrapure grade) are from Parsons (San Diego, California).

Chain assembly. Syntheses are carried out on Boc-amino acid-OCH2-
Pam-resins, or on 4-MeBHA-resin. Boc amino acids are used with the following side
chain protection: Arg(Tos); Asp(OBzl) (manual synthesis) and Asp(OcHxl)
(machine-assisted synthesis); Cys(Bzl); Asn, unprotected (manual synthesis) and Asn(Xan)
(machine-assisted synthesis); Glu(OcHxl); His(DNP); Lys(2ClZ); Thr(Bzl);
Trp(InFormyl); and Tyr(BrZ). Gln and Met are used side chain unprotected.

Manual protocol. Syntheses are carried out on a 0.2 mmol scale. The
Nα-Boc group is removed by treatment with 100% TFA for 2 x 1 minute followed by a
30 second flow with DMF. Boc amino acids (0.8 mmol) are coupled, without prior
neutralization of the peptide-resin salt, as active esters preformed in DMF with either
HOBt/DIC (30 minute activation), or HBTU/DIEA (2 minute activation) as activating
agents. For couplings with active esters formed by HOBt/DIC, neutralization is
performed in situ by adding 1.5 equivalents of DIEA relative to the amount of
TFA⁻·⁺NH3-peptide-resin salt to the activated Boc-amino acid/resin mixture. For couplings
with active esters formed from HBTU/DIEA, an additional 2 equivalents DIEA relative
to the amount of TFA⁻·⁺NH3-peptide-resin salt are added to the activation mixture.
Coupling times are 10 minutes throughout without any double coupling. Samples (3-5
mg) of peptide-resin are removed after the coupling step for determination of residual
free α-amino groups by the quantitative ninhydrin method. Coupling yields are
typically > 99.9%. All operations are performed manually in a 20 mL glass reaction
vessel with a Teflon-lined screw cap. The peptide-resin is agitated by gentle inversion
on a shaker during the Nα-deprotection and coupling steps.
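
The significance of the per-step coupling yield for long segments can be illustrated numerically. The following sketch assumes that every coupling proceeds with the same yield and that no other losses occur; the choice of 59 couplings (a 60-residue segment assembled on a resin preloaded with the first residue) is an illustrative assumption, not a value stated above.

```python
def coupling_equivalents(amino_acid_mmol: float, resin_mmol: float) -> float:
    """Molar excess of Boc amino acid over resin-bound peptide."""
    return amino_acid_mmol / resin_mmol


def cumulative_yield(per_step_yield: float, n_couplings: int) -> float:
    """Fraction of chains with the full-length sequence after n couplings,
    assuming an identical yield at every step."""
    return per_step_yield ** n_couplings


excess = coupling_equivalents(amino_acid_mmol=0.8, resin_mmol=0.2)
print(f"Boc amino acid excess: {excess:.1f} equivalents")

# Hypothetical 60-residue segment on a preloaded resin: 59 coupling steps
for step_yield in (0.999, 0.99):
    overall = cumulative_yield(step_yield, n_couplings=59)
    print(f"per-step {step_yield:.3f} -> overall {overall:.3f}")
```

Under these assumptions a per-step yield of 99.9% retains most chains as full-length product, whereas 99% would leave only slightly more than half.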

Deprotection and cleavage. His(DNP)-containing peptides are treated
with a solution of 20% mercaptoethanol/10% DIEA in DMF for 2 x 30 minutes in order
to remove the DNP group, prior to the removal of the Boc group. The Nα-Boc group is
removed from the peptide-resin by treatment with neat TFA (2 x 1 minute). The
peptide-resin is washed with DMF and neutralized with 10% DIEA in DMF (1 x 1
minute). After removal of the DNP and Boc group, the peptide-resin is treated with a
solution of ethanolamine in water/DMF for 2 x 30 minutes to remove the formyl group
of Trp(InFormyl).

The partially-deprotected peptide-resin is dried under reduced pressure
after washing with DMF and DCM. Side chain protecting groups are removed and
simultaneously the peptide is cleaved from the resin by treatment with HF/p-cresol (9:1
v/v, 0°C, 1 hour) or HF/p-cresol/thiocresol (9:0.5:0.5 by vol., 0°C, 1 hour). The HF is
removed under reduced pressure at 0°C and the crude peptide precipitated and washed
with ice-cold diethyl ether, then dissolved in either 20% or 50% aqueous acetic acid,
diluted with H2O and lyophilized.

Peptide joining. Joining of peptide segments of TM produced by the
synthetic procedures described above is carried out by chemical ligation of unprotected
peptides using previously described procedures (Baca, et al., J. Am. Chem. Soc.
117:1881-1887, 1995; Dawson, et al., Science 266:776-779, 1994). These procedures can
yield a free sulfhydryl at the junctional peptide bond or can yield a disulfide bond.
Alternatively, cysteine residues at specified positions are replaced by L-aminobutyric acid.

In one procedure, a synthetic segment peptide 1, which contains a
thioester at the α-carboxyl group, undergoes nucleophilic attack by the side chain thiol
of the Cys residue at the amino terminus of peptide 2. The initial thioester ligation
product undergoes rapid intramolecular reaction because of the favorable geometric
arrangement (involving a five-membered ring) of the α-amino group of peptide 2, to
yield a product with the native peptide bond of a cysteine moiety at the ligation site.
Both reacting peptide segments are in completely unprotected form, and the target
peptide is obtained in final form without further manipulation. Additional cysteine
residues in either peptide 1 or peptide 2 are left in their reduced state. The procedure is
referred to herein as native chemical ligation.
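
The requirements for native chemical ligation stated above (a C-terminal thioester on peptide 1, an N-terminal cysteine on peptide 2, and adjacent residue numbering) can be captured in a small data model. This is a hypothetical sketch for bookkeeping only; the class and function names are assumptions of this illustration, and the segment values are taken from entry C of Table II.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """A synthetic peptide segment with its terminal chemistry."""
    first: int    # first residue number
    last: int     # last residue number
    n_term: str   # N-terminal group
    c_term: str   # C-terminal group


def can_native_ligate(peptide_1: Segment, peptide_2: Segment) -> bool:
    """True when peptide_1 carries a C-terminal thioester, peptide_2 starts
    with cysteine, and the two segments are adjacent in the sequence."""
    return (peptide_1.c_term == "thioester"
            and peptide_2.n_term == "cysteine"
            and peptide_2.first == peptide_1.last + 1)


# Segments of the extended TM as listed in Table II, entry C
seg_1_67 = Segment(1, 67, n_term="NH3+", c_term="thioester")
seg_68_118 = Segment(68, 118, n_term="cysteine", c_term="thioacid")
seg_119_136 = Segment(119, 136, n_term="BrCH2CO", c_term="COO-")

print(can_native_ligate(seg_1_67, seg_68_118))     # junction 67/68
print(can_native_ligate(seg_68_118, seg_119_136))  # junction 118/119
```

In this model only the 67/68 junction satisfies the native chemical ligation criteria; the 118/119 junction is instead joined through a different chemistry.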



In another procedure, unprotected peptide segments are ligated via
nucleophilic attack of a deprotonated α-thioacid group on a bromoacetyl moiety to form
a dimer chemically ligated via thioester. In addition, C-terminal cysteamine moieties
can be joined to N-terminal mercaptoacetyl groups after derivatization of the
cysteamine-containing monomer with 2,2'-dipyridyl disulfide. A disulfide-linked dimer
is formed by thiolysis of the S-(2-pyridylsulfenyl) cysteamine derivative.

These procedures are used to derive a variety of TM configurations, such
as the representative TMs provided below. The TM core consists of residues 12-101
and the extended TM consists of residues 1-136.



Table II

Direct Synthesis of TM Polypeptides

A. TM Core
   Segments and chemistry:
      1. 12-71              N-cysteine; C-glyNH2CH2CH2SH
      2. 91-101             N-glyCOCH2SH; C-cysteine
   Strategy to form closed covalent loop: 71 to 91 via disulfide linker; 12 to 101 via
      renaturation and oxidation to disulfide
   Representative attachment sites: sulfhydryls at 14 and 68

B. TM Core
   Segments and chemistry:
      1. 31-71              N-BrCH2CO; C-glyNH2CH2CH2SH
      2. 91-[101-12]-30     N-glyCOCH2SH; C-thioacid
   Strategy to form closed covalent loop: 71 to 91 via disulfide linker; 30 to 31 via
      thioester; 12 to 101 exists as peptide bonds (serine-glycine-alanine in place of
      cys to cys disulfide)
   Representative attachment sites: sulfhydryls at 14 and 68

C. TM Extended
   Segments and chemistry:
      1. 1-67               N-NH3+; C-thioester
      2. 68-118             N-cysteine; C-thioacid
      3. 119-136            N-BrCH2CO; C-COO-
   Strategy to form closed covalent loop: 67 to 68 via native chemical ligation; 118
      to 119 via thioester; 71 to 91, 12 to 101 and 108 to 133 via renaturation and
      oxidation to form disulfides
   Representative attachment sites: sulfhydryls at 14 and 68

D. TM Core Variations
   1. serine 68
      Chemistry: same as A or B
      Strategy to form closed covalent loop: same as A or B
      Representative attachment sites: sulfhydryl at 14
      serine 14
      Chemistry: same as A or B
      Strategy to form closed covalent loop: same as A or B
      Representative attachment sites: sulfhydryl at 68
   2. serine 68 + serine 14
      Chemistry: same as A or B
      Representative attachment sites: free amines or free carboxyls

E. TM Extended Variations
   1. Segments and chemistry:
         1-70               N-NH3+; C-thioester
         71-118             N-cysteine; C-thioacid
         119-136            N-BrCH2CO; C-glyNH2CH2CH2SH
      Strategy to form closed covalent loop: 70 to 71 via native chemical ligation;
         118 to 119 via thioester; 71 to 91, 12 to 101 and 108 to 133 via renaturation
         and oxidation to form disulfides; serines at 14 and 68
      Representative attachment sites: reactive group at 136 for attachment of
         N-mercaptoacetylated peptide linker
   2. Segments and chemistry:
         1-70               N-BrCH2CO; C-thioester
         71-118             N-cysteine; C-thioacid
         119-136            N-BrCH2CO; C-COO-
      Strategy to form closed covalent loop: 70 to 71 via native chemical ligation;
         118 to 119 via thioester; 71 to 91, 12 to 101 and 108 to 133 via renaturation
         and oxidation to form disulfides; serines at 14 and 68
      Representative attachment sites: reactive group at 1 for attachment of
         C-thioester peptide linker

"Extended" = a TM comprising the 88 residues of the core, plus an additional 48
residues derived from native J chain; "Core" = residues 12-101 of native J chain;
residues are indicated according to the numbering in Figure 1
C. Synthesis and Expression of Synthetic DNAs Encoding TM

DNA chains can be synthesized by the phosphoramidite method, which
is well known in the art, whereby individual building block nucleotides are assembled
to create a desired sequence. Automated DNA synthesis of TM DNAs involves the
synthesis and joining of individual oligonucleotides encoding portions of TMs to form
the entire desired sequence. Synthetic DNA can be purchased from a number of
commercial sources.

Transgenic expression of TMs requires ligation of the synthetic coding
DNA into a vector for transformation of the appropriate organism. Techniques of
ligation into vectors are well described in the literature. For example, in order to enable
the introduction and expression of TMs in insect cells, the synthetic TM DNA is ligated
into the pFastBac1 vector (GibcoBRL) to form the pFastBac1-TM recombinant. The
recombinant vector is then used to transform E. coli bacteria containing a helper
plasmid and a baculovirus shuttle vector. High molecular weight shuttle vector DNA
containing transposed TM coding sequences is then isolated and used for transfection of
insect cells. Recombinant baculoviruses are harvested from transfected cells and used for
subsequent infection of insect cell cultures for protein expression.

A TM can be synthesized by expressing in cells a DNA molecule
encoding the TM. The DNA can be included in an extrachromosomal DNA element or
integrated into the chromosomal DNA of the cell expressing the TM. Alternatively, the
TM DNA can be included as part of the genome of a DNA or RNA virus which directs
the expression of the TM in the cell in which it is resident. An example of a DNA
sequence encoding TM is shown in SEQ ID NO:7. This DNA sequence and the amino
acid sequence (SEQ ID NO:17) encoded by this TM DNA are also shown in Table III.

One method of synthesizing such a TM gene involves the sequential
assembly of oligonucleotides encoding portions of the TM gene into a complete TM
gene. The final assembly of the TM gene can occur in a DNA expression vector
suitable for expression in a cellular system, or the TM gene can be constructed in a
convenient cloning vector and subsequently moved into a DNA expression vector
suitable for expression in a cellular system. An advantage of the sequential assembly of
the TM gene from partial coding regions is the ability to generate modified versions of
the TM gene by using alternative sequences for one or more of its individual portions
during the assembly of the TM gene. Alternatively, the restriction endonuclease sites
encoded in the TM gene can be used after the assembly of part or all of the TM gene to
replace portions of the TM coding sequence to generate alternative TM coding
sequences, using well known techniques, as described by Sambrook et al., Molecular
Cloning: A Laboratory Manual, 2d ed., Cold Spring Harbor Laboratory, Cold Spring
Harbor, New York, 1989. The TM gene can be divided into several partial coding
regions: D1 encoding amino acids approximately -2 to 20; C2 encoding amino acids
approximately 19 to 66; L3 encoding amino acids approximately 65 to 102; and T4
encoding amino acids approximately 102 to 142 of the sequence recited in Table III.
Unless otherwise indicated, references to amino acid residue numbers in the following
section are to the residue indicated in Table III.
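
The partial coding regions listed above overlap slightly at their boundaries. The sketch below records the approximate residue ranges given in the text and reports which region(s) cover a residue and which residues two regions share. The names `PARTIAL_REGIONS`, `regions_covering` and `shared_residues` are hypothetical, and the ranges are treated as inclusive purely for illustration.

```python
# Approximate residue ranges (Table III numbering) of the partial coding regions
PARTIAL_REGIONS = {
    "D1": (-2, 20),
    "C2": (19, 66),
    "L3": (65, 102),
    "T4": (102, 142),
}


def regions_covering(residue: int) -> list[str]:
    """Names of the partial coding regions whose range includes the residue."""
    return [name for name, (first, last) in PARTIAL_REGIONS.items()
            if first <= residue <= last]


def shared_residues(region_a: str, region_b: str):
    """Inclusive range of residues encoded by both regions, or None."""
    first = max(PARTIAL_REGIONS[region_a][0], PARTIAL_REGIONS[region_b][0])
    last = min(PARTIAL_REGIONS[region_a][1], PARTIAL_REGIONS[region_b][1])
    return (first, last) if first <= last else None


print(regions_covering(19))           # boundary between D1 and C2
print(shared_residues("C2", "L3"))    # residues encoded by both C2 and L3
print(shared_residues("D1", "T4"))    # non-adjacent regions share nothing
```

A lookup of this kind may be convenient when deciding which partial region must be exchanged to alter a given residue.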

Assembly of a synthetic gene encoding TM Core polypeptide. A TM
Core gene sequence may be defined by the combination of C2, D1.1 (a modified
version of D1), and L3A (a modified version of L3). One version of TM Core may be
generated from the oligonucleotides 1.1, 2.1, 3, 4, 5, 6, 7, 8, 9L3A and 10L3A (SEQ ID
NOS:48, 49, 54-56, 58, 60, 61, 63, 64) listed in Table IV and encodes a polypeptide of
sequence:

DQKCKCARITSRIIRSSEDPNEDIVERNIRIIVPLNNRENISDPTSPLRTRFVYHLSDLCKKDEDSATETC
(Table IX and SEQ ID NO:18). A gene containing D1.1, C2, and
L3A or alternate coding sequences that differ only in conservative substitutions or
modifications is a complete TM Core gene.
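
Simple properties of the TM Core polypeptide can be read directly from the sequence given above. The following sketch counts the residues and locates the cysteines; positions are counted from the first residue of the sequence as printed here, which is an assumption of this illustration and is not necessarily the numbering of Table III.

```python
TM_CORE = (
    "DQKCKCARITSRIIRSSEDPNEDIVERNIRIIVPLNNRENISDPTSPLRTRFVYHLSD"
    "LCKKDEDSATETC"
)


def residue_positions(sequence: str, residue: str) -> list[int]:
    """1-based positions of a one-letter residue code within the sequence."""
    return [i for i, aa in enumerate(sequence, start=1) if aa == residue]


core_length = len(TM_CORE)
cysteine_positions = residue_positions(TM_CORE, "C")

print(f"Length: {core_length} residues")
print(f"Cysteines at positions (within this sequence): {cysteine_positions}")
```

The same helper may be applied to any other residue of interest, for example to locate candidate attachment sites.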

Assembly of C2. In one example, de novo synthesis of a TM gene
(including the TM core) may be initiated by assembly of a partial gene, called C2,
encoding amino acids 19-66 of the TM. The sequence of C2 DNA and the peptide
sequence encoded by the C2 DNA are shown in Table V and SEQ ID NOS:9 and 19.
C2 is generated by annealing oligonucleotides 3, 4, 5, 6, 7 and 8 (SEQ ID NOS:54, 55,
56, 58, 60, and 61, respectively) of Table IV into a DNA fragment encoding
approximately 48 amino acids of the TM Core polypeptide. Oligonucleotide pairs 3&4,
5&6, and 7&8 are first annealed pairwise into overlapping DNA duplexes, and the 3
double stranded DNAs are then annealed together to form a double stranded DNA
complex composed of the 6 individual oligonucleotides. Oligonucleotides 1 and 8 have
overhanging unpaired ends compatible with the unpaired ends of DNA restricted with
the enzymes Xba I and Bgl II, respectively. C2 is annealed into the vector pMelBac
XP, at the Xba I and Bgl II restriction endonuclease sites of the multiple cloning region
and the DNA fragments enzymatically ligated to form the vector pTMC (Method 1).

Method 1: Synthesis of C2 DNA from oligonucleotides and insertion
into pMelBac XP to form pTMC. Individual oligonucleotides 3, 4, 5, 6, 7, and 8 (SEQ
ID NOS:54-56, 58, 60, 61) are separately dissolved in TE buffer (Sambrook et al.,
Molecular Cloning: A Laboratory Manual, 2d ed., Cold Spring Harbor Laboratory,
Cold Spring Harbor, New York, 1989) at a concentration of 1 mM (1
nanomole/microliter). Two nanomoles of each oligonucleotide are combined with the
same amount of its pair (e.g., (3&4), (5&6) or (7&8)) in 10 µL of annealing buffer (10
mM Tris pH 8.0, 100 mM NaCl, 1 mM EDTA) in a microcentrifuge tube, and the tubes
are immersed in 50 mL of boiling water for 5 minutes. The entire boiling water bath,
including microcentrifuge tubes, is then removed from the heat source and allowed to
cool to room temperature (approximately 24°C), allowing the oligonucleotides to form
base-paired DNA duplexes. After incubating for 30 minutes at room temperature, 1
nanomole of each oligonucleotide pair (e.g., (3&4), (5&6), and (7&8)) is combined
in a single microcentrifuge tube. The tube containing these DNA duplexes is incubated
at 55°C for 15 minutes in a heating block, then removed from the heating block and
equilibrated to room temperature, allowing overlapping complementary regions of the
DNA duplexes to anneal, forming a DNA duplex encoding the partial TM DNA C2.

One nanomole of the oligonucleotide duplex is then mixed with 0.1
picomole of pMelBac XP which has previously been restricted with endonucleases Xba
I and Bgl II. pMelBac XP is a DNA vector for cloning and subsequent expression in
insect cells of synthetic TM genes, derived from pMelBac B (Invitrogen, San Diego,
California). The sequence of the secretion signal and multiple cloning site is (SEQ ID
NOS:42 and 43):

met lys phe leu val asn val ala leu val phe met val tyr
atg aaa ttc tta gtc aac gtt gcc ctt ttt atg gtc gta tac
ile ser tyr ile tyr ala asp pro ser ser ser ala
att tct tac atc tat gcg gat ccg agc tcg agt gct cta ga
tct gca gct ggt acc atg gaa ttc gaa gct tgg agt cga ctc
tgc tga

The mixture of vector DNA and synthetic gene fragment is then heated
to 35°C for 15 minutes, then 1/10 volume of Ligation Stock Buffer is added, DNA
ligase is added and the reaction mixture is incubated at 12°C for 12 hours to ligate the
phosphodiester bonds among oligonucleotides and vector DNA, as described in
Sambrook et al., Molecular Cloning: A Laboratory Manual, 2d ed., Cold Spring Harbor
Laboratory, Cold Spring Harbor, New York, 1989. The DNA is then used for transfection
of competent E. coli cells by standard methods (see Sambrook et al., supra). Plasmid
DNA is isolated from these cells and is evaluated by restriction endonuclease digestion
or DNA sequencing to confirm the success of synthetic DNA assembly and cloning.
The resulting plasmid, pTMC, is then used as a framework for successive addition of
synthetic TM sequences.
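
The quantities in Method 1 can be checked with a short calculation. The following is a minimal sketch, not part of the described method; the function names are hypothetical, and it assumes only the figures stated above (a 1 mM stock, which is 1 nanomole/microliter, and 1 nanomole of duplex mixed with 0.1 picomole of vector).

```python
# Illustrative arithmetic for Method 1; helper names are hypothetical.

STOCK_NMOL_PER_UL = 1.0  # a 1 mM stock contains 1 nmol per microliter


def stock_volume_ul(nmol_needed):
    """Volume of 1 mM oligonucleotide stock that delivers nmol_needed."""
    return nmol_needed / STOCK_NMOL_PER_UL


def insert_to_vector_ratio(insert_nmol, vector_pmol):
    """Molar ratio of insert duplex to vector (1 nmol = 1000 pmol)."""
    return (insert_nmol * 1000.0) / vector_pmol


if __name__ == "__main__":
    # Two nanomoles of each oligonucleotide are drawn from its stock.
    print("stock volume per oligo (uL):", stock_volume_ul(2.0))
    # One nanomole of duplex is mixed with 0.1 picomole of cut vector.
    print("insert:vector molar ratio:", round(insert_to_vector_ratio(1.0, 0.1)))
```

Under these figures each oligonucleotide contributes 2 microliters of stock, and the synthetic duplex is present in roughly a 10,000-fold molar excess over the vector.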

Assembly of D1.1 and insertion into the TM synthetic gene. A fragment
of the TM DNA proximal to C2, called D1.1, encodes amino acids 9 to 20 of the TM.
The DNA sequence and primary amino acid peptide sequence of D1.1 are shown in
Table VI and SEQ ID NOS:10 and 20. D1.1 encodes the proximal amino acids of the
TM Core polypeptide (residues 12 to 20) as well as a short peptide of three amino acids
which serves to join the TM Core with a leader peptide (appropriate for the expression
system employed for synthesis of TM). D1.1 is generated by annealing
oligonucleotides 1.1 and 2.1 (SEQ ID NOS:48 and 49, respectively) into a DNA duplex
as described in Method 1. Oligonucleotides 1.1 and 2.1 have overhanging unpaired
ends compatible with the unpaired ends of BamHI (or Bgl II) and Xba I, respectively.
D1.1 is annealed into pTMC at the BamHI and Xba I restriction endonuclease sites of
the multiple cloning region and the DNA fragments are enzymatically ligated, in a manner
similar to that described in Method 1 for pTMC, to form the vector pTMD1.1C.
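
The overhang compatibility described for oligonucleotides 1.1 and 2.1 can be illustrated computationally. The sketch below is an illustration only: the two sequences are transcribed from Table IV, the helper names are hypothetical, and the expected overhangs rest on the well known cleavage patterns of BamHI and Bgl II (5' GATC overhang) and Xba I (5' CTAG overhang).

```python
# Hypothetical helpers; sequences transcribed from Table IV (oligos 1.1 and 2.1).

OLIGO_1_1 = "gat cag aag tgc aag tgt gct cgt att act t".replace(" ", "")
OLIGO_2_1 = "ct aga agt aat acg agc aca ctt gca ctt ct".replace(" ", "")

COMPLEMENT = {"a": "t", "t": "a", "g": "c", "c": "g"}


def revcomp(seq):
    """Reverse complement of a lowercase DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def duplex_overhangs(top, bottom):
    """Return the 5' overhangs (top, bottom) left when two oligos anneal.

    The top strand is slid along the reverse complement of the bottom
    strand until its 3' portion pairs completely; whatever is left
    unpaired at each 5' end is reported as an overhang.
    """
    bottom_rc = revcomp(bottom)
    for n in range(len(top)):  # n = length of the top strand 5' overhang
        paired = top[n:]
        if bottom_rc.startswith(paired):
            return top[:n], revcomp(bottom_rc[len(paired):])
    raise ValueError("the two oligonucleotides do not anneal")


if __name__ == "__main__":
    top_overhang, bottom_overhang = duplex_overhangs(OLIGO_1_1, OLIGO_2_1)
    print("oligo 1.1 overhang:", top_overhang)     # BamHI / Bgl II compatible
    print("oligo 2.1 overhang:", bottom_overhang)  # Xba I compatible
```

The same check could in principle be applied to any oligonucleotide pair in Table IV whose sequence is legible.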

Assembly of L3A and insertion into the TM synthetic gene. A fragment
of the TM DNA distal to C2, called L3A, encodes a contiguous polypeptide of amino
acids 66-70 and 92-101 of the TM provided in Table III. The DNA sequence and
peptide sequence of L3A are shown in Table VII and SEQ ID NOS:11 and 21. L3A is
generated by annealing oligonucleotides 9L3A and 10L3A (SEQ ID NOS:63 and 64,
respectively) into a DNA duplex as described in Method 1 to generate the distal portion
of the TM Core DNA encoding approximately 14 amino acids. Oligonucleotides 9L3A
and 10L3A have overhanging unpaired ends compatible with the unpaired ends of Bgl II
and EcoRI, respectively. L3A is annealed into the vector pTMD1.1C at the Bgl II and
EcoRI restriction endonuclease sites and the DNA fragments are enzymatically ligated, in a
manner similar to that described in Method 1 for pTMC, to form the vector pTMCore.

A TM may also be synthesized as described above, except that L3
(discussed below) is used in place of L3A. The sequence of such a TM is provided in
Table X and SEQ ID NO:13.

Assembly of a synthetic gene encoding a full length TM polypeptide. A
full length TM gene sequence may be defined by the combination of D1, C2, L3 and
T4. One example of a full length TM gene (SEQ ID NO:7) is generated from the
oligonucleotides 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, and 16 (SEQ ID NOS:46,
47, 54-56, 58, 60-62, 73-79, respectively) listed in Table IV. A gene containing D1,
C2, L3, and T4 or coding sequences that differ only in conservative substitutions or
modifications is a full length TM gene.

Assembly of D1 and insertion into the TM synthetic gene. A fragment
of the TM DNA proximal to C2, called D1, encodes amino acids -2 to 20 of the TM.
The DNA sequence and peptide sequence of D1 are shown in Table VI.A and SEQ ID
NOS:15 and 25. D1 encodes the proximal amino acids of the TM Core polypeptide
(residues 12 to 20) as well as a peptide of 13 amino acids which serves to join the TM
Core with a leader peptide (appropriate for the expression system employed for
synthesis of TM). D1 is generated by annealing oligonucleotides 1 and 2 (Table IV).
Oligonucleotides 1 and 2 have overhanging unpaired ends compatible with the unpaired
ends of BamHI (or Bgl II) and Xba I, respectively. D1 is annealed into pTMC at the
BamHI and Xba I restriction endonuclease sites of the multiple cloning region and the
DNA fragments are enzymatically ligated, in a manner similar to that described in Method
1 for pTMC, to form the vector pTMDC.

Assembly of L3 and insertion into the TM synthetic gene. A fragment
of the TM DNA distal to C2, called L3, encodes amino acids 66-101 of TM. The DNA
sequence and peptide sequence of L3 are shown in Table VII.A and SEQ ID NOS:14
and 24. L3 is generated by annealing oligonucleotides 9, 10, 11, and 12 (SEQ ID
NOS:62, 73-75, respectively) (Table IV) into a DNA duplex to generate the distal
portion of the TM Core DNA encoding approximately 35 amino acids. Oligonucleotide
pairs 9&10 and 11&12 are first annealed pairwise into overlapping DNA duplexes, and
the two double stranded DNAs are then annealed together to form a double stranded DNA
complex composed of the 4 individual oligonucleotides. Oligonucleotides 9 and 12
have overhanging unpaired ends compatible with the unpaired ends of Bgl II and Pst I,
respectively. L3 is annealed into the vector pTMDC at the Bgl II and Pst I restriction
endonuclease sites and the DNA fragments are enzymatically ligated, in a manner similar to
that described in Method 1 for pTMC, to form the vector pTMDCL.

Assembly of T4 and insertion into the TM synthetic gene. A fragment
of the TM DNA distal to L3, called T4, encodes amino acids 102-141 of the TM. The
DNA sequence and peptide sequence of T4 are shown in Table VIII and SEQ ID
NOS:12 and 22. T4 is generated by annealing oligonucleotides 13, 14, 15, and 16 (SEQ
ID NOS:76-79, respectively) (Table IV) into a DNA fragment which is the distal
portion of the full length TM DNA encoding approximately 36 amino acids.
Oligonucleotide pairs 13&14 and 15&16 are first annealed pairwise into overlapping
DNA duplexes, and the two double stranded DNAs are subsequently annealed together
to form a double stranded DNA complex composed of the 4 individual
oligonucleotides. Oligonucleotides 13 and 16 have overhanging unpaired ends
compatible with the unpaired ends of Pst I and EcoRI, respectively. T4 is annealed
into the vector pTMDCL at the Pst I and EcoRI restriction endonuclease sites and the
DNA fragments are enzymatically ligated, in a manner similar to that described in Method
1 for pTMC, to form the vector pTM.

Assembly of synthetic genes encoding modified TM polypeptides. Other
versions of TM genes, in which the peptide sequence is altered from the full length TM
or TM Core, can be synthesized by using alternative oligonucleotides to 1, 2, 3, 4, 5, 6,
7, 8, 9, 10, 11, 12, 13, 14, 15, or 16 (SEQ ID NOS:46, 47, 54-56, 58, 60-62, 73-79,
respectively) listed in Table IV. These alternative oligonucleotides can be employed
during synthesis of a partial TM gene, or can be used to generate DNA fragments which
can replace coding sequences in an assembled TM gene or TM gene fragment by
removing DNA fragments with restriction endonucleases, and replacing the original
sequence with an alternative coding sequence. In addition, DNA sequences encoding
polypeptides unrelated to TM can be inserted into the TM coding sequences at various
positions.

Assembly of a synthetic gene encoding an aglycosylated TM
polypeptide. In one example, oligonucleotides 5 and 6 are replaced during the assembly
of C2 with oligonucleotides 5.1dg and 6.1dg (SEQ ID NOS:57 and 59) (Table IV) to
form a new fragment called C2Aglyco. This oligonucleotide substitution results in an
altered C2 DNA sequence so that the asparagine encoded at residue 48 is changed to a
histidine. With the exception of the oligonucleotides 5.1dg and 6.1dg, C2Aglyco is
created in the same manner as C2. C2Aglyco can be used in the synthesis of a variety
of TM sequences in a manner similar to that described for TM Core and full length TM
sequences.
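
The effect of the residue 48 substitution can be sketched on the TM Core peptide sequence recited above. This is an illustration under stated assumptions, not a description of the cloning itself: the helper names are hypothetical, the residue numbering follows Table III (the core spans residues 9-70, then 92-97 and 99-101), and the interpretation that the change removes an N-linked glycosylation sequon (Asn-X-Ser/Thr, with X not Pro) is the conventional rationale rather than a statement taken from the text.

```python
import re

# TM Core peptide as recited in the text (Table IX, SEQ ID NO:18).
TM_CORE = ("DQKCKCARITSRIIRSSEDPNEDIVERNIRIIVPLNNRENISDPTSPLRTRFVYHLSD"
           "LCKKDEDSATETC")

# Table III numbering for the core: 9-70, then 92-97 and 99-101.
RESIDUE_NUMBERS = list(range(9, 71)) + list(range(92, 98)) + list(range(99, 102))


def substitute(seq, numbers, residue, old, new):
    """Replace the amino acid at a Table III residue number."""
    i = numbers.index(residue)
    if seq[i] != old:
        raise ValueError(f"residue {residue} is {seq[i]}, not {old}")
    return seq[:i] + new + seq[i + 1:]


def sequons(seq, numbers):
    """Residue numbers of Asn in an Asn-X-Ser/Thr motif (X not Pro)."""
    return [numbers[m.start()] for m in re.finditer(r"N(?=[^P][ST])", seq)]


if __name__ == "__main__":
    print("sequons before:", sequons(TM_CORE, RESIDUE_NUMBERS))
    aglyco = substitute(TM_CORE, RESIDUE_NUMBERS, 48, "N", "H")
    print("sequons after: ", sequons(aglyco, RESIDUE_NUMBERS))
```

The same `substitute` helper could be used to model the cysteine replacements at residues 14 and 68 described in later examples.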

Assembly of a synthetic gene encoding a TM polypeptide with a
modified L3 domain. In another example, TM amino acid residues 71-91 are replaced
with the three amino acid peptide: ser-asp-ile. In this example oligonucleotides 9.2A3
and 10.2A3 (SEQ ID NOS:67 and 68) (Table IV) are first annealed into a DNA duplex
and subsequently annealed into the vector pTMDC at the Bgl II and EcoRI restriction
endonuclease sites. The annealed DNA fragments are then enzymatically ligated to
form the vector pTMLA3.

Assembly of synthetic genes encoding a TM polypeptide with cysteine
residue 68 replaced. In other examples, the oligonucleotide pairs 9.3A3ser&10.3A3ser
(SEQ ID NOS:69 and 70) or 9.3A3val&10.3A3val (SEQ ID NOS:71 and 72) are
annealed into DNA duplexes and digested with the enzyme Cla I and subsequently
annealed into pTMLA3 which has been digested with restriction enzymes Cla I and Pst I.
These two oligonucleotide pairs, when inserted into pTMLA3, result in a TMA3
molecule with the cysteine at position 68 replaced by serine or valine, respectively.

Assembly of synthetic genes encoding a TM polypeptide with cysteine
residue 14 replaced. In another example the oligonucleotide pairs 1.2ser&2.2ser (SEQ
ID NOS:50 and 51) or 1.2val&2.2val (SEQ ID NOS:52 and 53) can be annealed to
generate an alternative domain to D1 with the cysteine residue 14 replaced with serine
or valine, respectively. These oligonucleotide pairs are then annealed, in the same
manner as described above for D1, into pTMC at the BamHI and Xba I restriction
endonuclease sites of the multiple cloning region and the DNA fragments are enzymatically
ligated to form alternatives to the vector pTMD1C.

Assembly of a synthetic gene encoding a TM core polypeptide
containing an endomembrane retention signal. In a further example TM core is
synthesized with the endomembrane retention signal KDEL (SEQ ID NO:44) as the
carboxyterminal amino acid residues. In this example oligonucleotides 9L3AKDEL and
10L3AKDEL (SEQ ID NOS:65 and 66) are substituted for oligonucleotides 9L3A and
10L3A during synthesis of TM core described above to form the vector pTMLA3KDEL.

Assembly of a synthetic gene encoding a full length TM polypeptide
containing an endomembrane retention signal. In another example TM is synthesized
with the endomembrane retention signal KDEL (SEQ ID NO:44) as the
carboxyterminal amino acid residues. In this example oligonucleotides 15KDEL and
16KDEL (SEQ ID NOS:80 and 81) are substituted for oligonucleotides 15 and 16 as
described above for synthesis of T4. The substitution of these two oligonucleotides
results in the formation of coding sequence T4KDEL which, when substituted for T4 in
the above described synthesis of pTM, results in the formation of the vector pTMKDEL.
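
The relationship between oligonucleotides 15 and 15KDEL can be checked by translating both. The following is a minimal sketch: the sequences are transcribed from Table IV, the function and variable names are hypothetical, and translation uses the standard genetic code, reading from the first base of each oligonucleotide (an assumption that matches residues 126-141 of Table III).

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Transcribed from Table IV; the trailing "tg" is the start of the stop codon.
OLIGO_15 = "aca aaa atg gtg gaa act gcc ctt acg ccc gat gca tgc tat ccg gac tg"
OLIGO_15KDEL = ("aca aaa atg gtg gaa act gcc ctt acg ccc gat gca tgc tat ccg gac "
                "aag gat gaa ttg tg")


def translate(dna):
    """Translate complete codons from the first base; ignore a partial codon."""
    seq = dna.replace(" ", "").lower()
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


if __name__ == "__main__":
    print("oligo 15:    ", translate(OLIGO_15))
    print("oligo 15KDEL:", translate(OLIGO_15KDEL))
```

On this reading, the only difference between the two oligonucleotides is the four codons inserted ahead of the stop codon, which encode the retention signal.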

Assembly of a synthetic gene encoding a TM polypeptide containing an
additional amino terminal sequence. In one example a TM gene is synthesized with the
polyimmunoglobulin receptor sequence from residues 585-600
(AIQDPRLFAEEKAVAD; SEQ ID NO:45) included as part of the amino terminal
domain. The oligonucleotides P1 and P2 (SEQ ID NOS:82 and 83) encode this
polyimmunoglobulin receptor sequence and amino acid residues of D1. P1 and P2 have
overhanging unpaired ends compatible with the unpaired ends of BamHI and Xba I,
respectively. The oligonucleotides P1 and P2 are annealed into a DNA duplex which
can be used in place of D1.1 or D1 in the synthesis of TM expression vectors as
described above.

Assembly of a synthetic gene encoding a TM polypeptide in which a
component of TM is replaced by another peptide domain, TpS2. In this example, a TM
gene is synthesized with a peptide replacing TM Domains 4, 5 and 6. This peptide,
referred to as TpS2, encodes an enterokinase cleavable peptide between the terminal
residue of Domain 2 and the coding sequence for the trefoil peptide pS2 (as reported in
Suemori et al., Proc. Natl. Acad. Sci. USA 88:11017-11021, 1991). The DNA sequence and
peptide sequence of TpS2 are shown in Table XI and SEQ ID NOS:16 and 26. TpS2 is
generated by annealing oligonucleotides Tp1, Tp2, Tp3, Tp4, Tp5 and Tp6 (SEQ ID
NOS:103-108, respectively) (Table IV) into a DNA fragment which encodes
approximately 64 amino acids. Oligonucleotide pairs Tp1 & Tp2, Tp3 & Tp4 and Tp5
& Tp6 are first annealed pairwise into overlapping DNA duplexes, and the three double
stranded DNAs are subsequently annealed together to form a double stranded DNA
complex composed of the 6 individual oligonucleotides. Oligonucleotides Tp1 and Tp6
have overhanging unpaired ends compatible with the unpaired ends of Pst I and EcoRI
restriction sites, respectively. TpS2 is annealed into the vector pTMDCL at the Pst I and
EcoRI restriction endonuclease sites and the DNA fragments are enzymatically ligated, in a
manner similar to that described in Method 1 for pTMC, to form a vector pTMpSp2,
which encodes a TM with the trefoil peptide pS2 included as a replacement for TM
Domains 4, 5, and 6.






D. Isolation and Expression of cDNA Encoding Human J Chain. 

Two human small intestine cDNA libraries (Clontech Laboratories, Palo
Alto, California; cat. #HL1133a and #HL1133b) are screened using a synthetic DNA
complementary to the 5' end of the human J chain messenger RNA. The probe is
labeled with [32P] using polynucleotide kinase in standard reactions. The library
screening is performed as described by the manufacturer (Clontech). Hybridization is
carried out according to Church and Gilbert, Proc. Natl. Acad. Sci. USA 81:1991-1995,
1984. After autoradiography, positive plaques are isolated and the phage are disrupted
by boiling for 10 minutes. The cDNA inserts are amplified by PCR in a total volume of
50 µL containing standard PCR buffer, 25 pmoles of primers complementary to the 5'
and 3' ends of the human J chain cDNA, 200 µM of each dNTP, and 1.0 unit of Taq
polymerase. The DNA is denatured for 3 minutes at 94°C prior to 35 cycles of
amplification. Each cycle consists of 1 min at 94°C, 1 min at 62°C, and 1 min at
72°C. The PCR fragments are cloned into pUC19 and sequenced. Full length cDNA
inserts are then subcloned into the appropriate insect expression vector (pMelBacXP)
utilizing restriction sites placed in the two PCR primers.
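
The PCR conditions above imply a total programmed time and a primer concentration that can be worked out directly. This is a sketch for orientation only; the function names are hypothetical, ramp times between temperatures are ignored, and it assumes the 25 pmoles refers to each primer.

```python
# Illustrative arithmetic for the PCR step; helper names are hypothetical.

def pcr_minutes(initial_denature_min, cycles, step_minutes):
    """Programmed time: initial denaturation plus all cycle steps (no ramping)."""
    return initial_denature_min + cycles * sum(step_minutes)


def micromolar(pmol, volume_ul):
    """Concentration in micromolar (pmol per microliter equals micromolar)."""
    return pmol / volume_ul


if __name__ == "__main__":
    # 3 min at 94 C, then 35 cycles of 1 min each at 94 C, 62 C and 72 C.
    print("programmed time (min):", pcr_minutes(3, 35, (1, 1, 1)))
    # 25 pmol of a primer in a 50 microliter reaction.
    print("primer concentration (uM):", micromolar(25, 50))
```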

TABLE III

DNA Sequence and Primary Amino Acid Structure of a Representative
Full Length TM Molecule

 -2  -1   1   2   3   4   5   6   7   8   9  10
asp gln glu asp glu arg ile val leu val asp asn
gat cag gaa gat gaa cgt att gtt ctg gtt gac aac
cta gtc ctt cta ctt gca taa caa gac caa ctg ttg

 11  12  13  14  15  16  17  18  19  20  21  22
lys cys lys cys ala arg ile thr ser arg ile ile
aag tgc aag tgt gct cgt att act tct aga atc atc
ttc acg ttc aca cga gca taa tga aga tct tag tag

 23  24  25  26  27  28  29  30  31  32  33  34
arg ser ser glu asp pro asn glu asp ile val glu
cgt agc tca gag gac cca aat gaa gat ata gtc gaa
gca tcg agt ctc ctg ggt tta ctt cta tat cag ctt

 35  36  37  38  39  40  41  42  43  44  45  46
arg asn ile arg ile ile val pro leu asn asn arg
cgt aac atc cgt atc atc gtc cca ctg aat aac cgg
gca ttg tag gca tag tag cag ggt gac tta ttg gcc

 47  48  49  50  51  52  53  54  55  56  57  58
glu asn ile ser asp pro thr ser pro leu arg thr
gag aat atc tca gat cct aca agt ccg ttg cgc aca
ctc tta tag agt cta gga tgt tca ggc aac gcg tgt

 59  60  61  62  63  64  65  66  67  68  69  70
arg phe val tyr his leu ser asp leu cys lys lys
cgc ttc gta tac cac ctg tca gat ctg tgt aag aag
gcg aag cat atg gtg gac agt cta gac aca ttc ttc

 71  72  73  74  75  76  77  78  79  80  81  82
cys asp pro thr glu val glu leu asp asn gln ile
tgt gat cca aca gag gta gag ctg gac aat cag ata
aca cta ggt tgt ctc cat ctc gac ctg tta gtc tat

 83  84  85  86  87  88  89  90  91  92  93  94
val thr ala thr gln ser asn ile cys asp glu asp
gtc act gcg act caa agc aac att tgc gat gag gac
cag tga cgc tga gtt tcg ttg taa acg cta ctc ctg

 95  96  97  99 100 101 102 103 104 109 110 111
ser ala thr glu thr cys ser thr tyr asp arg asn
agc gct aca gaa acc tgc agc acc tac gat agg aac
tcg cga tgt ctt tgg acg tcg tgg atg cta tcc ttg

112 113 114 115 116 117 118 119 120 121 122 123
lys cys tyr thr ala val val pro leu val tyr gly
aaa tgc tac acg gcc gtg gtt ccg ctc gtg tat ggt
ttt acg atg tgc cgg cac caa ggc gag cac ata cca

124 125 126 127 128 129 130 131 132 133 134 135
gly glu thr lys met val glu thr ala leu thr pro
gga gag aca aaa atg gtg gaa act gcc ctt acg ccc
cct ctc tgt ttt tac cac ctt tga cgg gaa tgc ggg

136 137 138 139 140 141
asp ala cys tyr pro asp OPA
gat gca tgc tat ccg gac tga attc
cta cgt acg ata ggc ctg act taag

TABLE IV 
Oligonucleotides for Construction of 
Representative Partial TM Genes 

OLIGO    SEQUENCE

1:       gat cag gaa gat gaa cgt att gtt ctg gtt gac aac aag tgc aag tgt
         gct cgt att act t

2:       cta gaa gta ata cga gca cac ttg cac ttg ttg tca acc aga aca ata
         cgt tca tct tcc t

1.1:     gat cag aag tgc aag tgt gct cgt att act t

2.1:     ct aga agt aat acg agc aca ctt gca ctt ct

1.2ser: 


gat 
get 


cag 
cgt 


gaa 
att 


gat 
act 


gaa 
t 


cgt 


att 


gtt 


ctg 


gtt 


2.2ser: 


eta 
cgt 


gaa 
tea 


gta 
tct 


ata 
tec 


cga 
t 


gcg 


gac 


ttg 


cac 


ttg 


1.2val : 


gat 
get 


cag 
cgt 


gaa 
att 


gat 
act 


gaa 
t 


cgt 


att 


gtt 


ctg 


gtt 


2.2va1 : 


eta 
cgt 


gaa 
tea 


gta 
tct 


ata 
tec 


cga 
t 


gca 


ace 


ttg 


cac 


ttg 


3- 


eta 


gaa 


tea 


tec 


gta 


get 


cag 


agg 


ace 


caa 


4: 


gat 

acg 


acg 
gat 


gat 
gat 


gtt 
t 


acg 


ttc 


gac 


tat 


ate 


ttc 


5: 


cgt 
g 


aac 


ate 


cgt 


ate 


ate 


gtc 


cca 


ctg 


aat 


5.1dg: 


cgt 
g 


aac 


ate 


cgt 


ate 


ate 


gtc 


cca 


ctg 


aat 


6: 


acg 
t 


gac 


ttg 


tag 


gat 


ctg 


aga 


tat 


tct 


ccc 


6.1dg: 


acg 
t 


gac 


ttg 


tag 


gat 


ctg 


aga 


tgt 


get 


ccc 


7: 


ate 


eta 


caa 


gtc 


cgt 


tgc 


gca 


cac 


get 


teg 


8: 


gat 


ctg 


aca 


ggt 


ggt 


ata 


cga 


age 


gtg 


tgc 


9: 


gat 
ata 


ctg 
gtc 


tgt 
act 


aag 
gca 


aag 


tgt 


gat 


cca 


aca 


gag 


9L3D: 


gat 


ctg 


tgt 


aag 


aag 


gat 


gag 


gac 


age 


get 



10L3D: 


aat 


tea 


gca 


ggt 


ttc 


tgt 


age get 


9L3DKDEL: 


gat 
aag 


ctg 
gat 


tgt 
gag 


aag 
ctg 


aag 
tg 


gat 


gag gac 


1 01 3DKDFI ■ 


aat 
ate 


tea 
ctt 


cag 
ctt 


etc 
aca 


ate 
ca 


ctt 


cgc ate 


9.2D3: 


aat 
tgc 


eta 
age 


tat 

uy u 

aca 


aaa 

uuy 

tg 


aaa 

u uy 


tct 


gat ate 


10.2D3: 


aat 
ctt 


tea 
ctt 


tgt 
aca 


get 
ca 


gca 


ggt 


ttc tgt 


9.3D3/ser68: 


gat 
eta 


ctg 
tag 


tct 
eta 


aag 
ctt 


aag 
eta 


tct 
a 


gat ate 


10.3D3/ser68: 


aat 


ctt 


cat 


cga 


tat 


cag 


act tct 


9.3D3/va168: 


gat 
eta 


ctg 
tag 


gtt 
eta 


aag 
ctt 


aag 
eta 


tct 
a 


gat ate 


10.3D3/val68: 


aat 


ctt 


cat 


cga 


tat 


cag 


act tct 


10: 


att 


gtc 


cag 


etc 


tac 


etc 


tgt tgg 


ii : 


act 


caa 


age 


aac 


att 


tgc 


gat gag 


12: 


ggt 
gac 


ttc 
tat 


tgt 
ctg 


age 


get 


ctg 


etc ate 


13: 


gc acc tac gat agg aac aaa tgc ■ 
tat ggt gga gag 


14: 


gag 


egg 


aac 


cac 


ggc cgt gta gca 



15:      aca aaa atg gtg gaa act gcc ctt acg ccc gat gca tgc tat ccg gac
         tg

16:      aat tca gtc cgg ata gca tgc atc ggg cgt aag ggc agt ttc cac cat
         ttt tgt ctc tcc acc ata cac

15KDEL:  aca aaa atg gtg gaa act gcc ctt acg ccc gat gca tgc tat ccg gac
         aag gat gaa ttg tg

16KDEL:  aat tca caa ttc atc ctt gtc cgg ata gca tgc atc ggg cgt aag ggc
         agt ttc cac cat ttt tgt ctc tcc acc ata cac

P1:      gat cag gtc gct gcc atc caa gac ccg agg ctg ttc gcc gaa gag aag
         gcc gtc gct gac tcc aag tgc aag tgt gct cgt att act t

P2:      ct aga agt aat acg agc aca ctt gca ctt gga gtc agc gac ggc ctt
         ctc ttc ggc gaa cag cct cgg gtc ttg gat ggc agc gac ct
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Tpl: gc gat gac gac gat aag gcc caa acg gag acc tgt act gtt gcg cct 







cgt 


gaa 


egg 


caa 


aac 


tgc 


gga 


ttc 




Tp2: 


gtt 


ttg 


ccg 


ttc 


acg 


agg 


cgc 


aac 


J 




axe 


gtc 


gtc 


-j + r , 
axe 


get 


xca 








Tp3: 


gta 


aca 


ccc 


tct 


cag 


tgc 


get 


aat 






gta 


egg 


ggc 


gtt 


ccg 


tgg 


tgc 


ttc 


10 


Tp4: 


gcc 


ccg 


tac 


cgt 


gtc 


ate 


aaa 


aca 






ggg 


tgt 


tac 


ttc 


egg 


gaa 


tec 


gca 




Tp5: 


tac 

g 


ccc 


aat 


aca 


att 


gac 


gtt 


ccg 



15 



Tp6: aattc tta egg etc gca etc ttc ttc agg egg caa gtc aat tgt att 

ggg gta gaa gca cca egg aac 






TABLE V 

Peptide and DNA Sequence of Domain C2 of TM 
(TM AA Residues 19-65) 



19 20 21 22 23 24 25 26 27 28 

ser arg lie lie arg ser ser glu asp pro 

ct aga ate ate cgt age tea gag gac cca 

t tag tag gca teg agt etc ctg ggt 

37 38 39 40 41 42 43 44 45 46 

lie arg lie lie val pro leu asn asn arg 

ate cgt ate ate gtc cca ctg aat aac egg 

tag gca tag tag cag ggt gac tta ttg gcc 
«««< oligo #6 



55 56 57 58 59 60 61 62 63 64 65 66 
pro leu arg thr arg phe val tyr his leu ser asp leu 
»»»»» oligo #7 »»»»»»>»»»»» 
ccg ttg cgc aca cgc ttc gta tac cac ctg tea 
ggc aac gcg tgt gcg aag cat atg gtg gac agt eta g 
««/««< oligo #8 



29 30 31 32 33 34 35 36 
asn glu asp lie val glu arg asn 
>»»»»»»»»»»»> i »>»» 

aat gaa gat ata gtc gaa cgt aac 
tta ctt eta tat cag ctt gca ttg 



47 


48 


49 


50 51 


52 


53 54 


glu 


asn 


lie 


ser asp 

»»/>»: 

tea gat 


pro 


thr ser 


gag 


aat 


ate 


cct 


aca agt 


etc 


tta 


tag 


agt eta 


gga 


tgt tea 



amino acid number 
amino acid 
coding strand oligo 
coding strand 
noncoding strand 
noncoding strand oligo 

TABLE VI

DNA Sequence and Primary Amino Acid Structure
of Domain D1.1 of TM
(TM AA Residues 9-20)

  9  10  11  12  13  14  15  16  17  18  19  20
asp gln lys cys lys cys ala arg ile thr ser arg
>>>>>>>>>>>>>> oligo D1.1 >>>>>>>>>>>>>>
gat cag aag tgc aag tgt gct cgt att act t
     tc ttc acg ttc aca cga gca taa tga aga tc
<<<<<<<<<<<<<< oligo D2.1 <<<<<<<<<<<<<<



TABLE VI.A

DNA Sequence and Primary Amino Acid Structure
of Domain D1 of TM
(TM AA Residues -2-20)

 -2  -1   1   2   3   4   5   6   7   8   9  10
asp gln glu asp glu arg ile val leu val asp asn
gat cag gaa gat gaa cgt att gtt ctg gtt gac aac
     tc ctt cta ctt gca taa caa gac caa ctg ttg

 11  12  13  14  15  16  17  18  19  20
lys cys lys cys ala arg ile thr ser arg
aag tgc aag tgt gct cgt att act t
ttc acg ttc aca cga gca taa tga aga tc





TABLE VII

Peptide and DNA Sequence of Domain L3A of TM
(TM AA Residues 66-70 and 92-101)

 66  67  68  69  70  92  93  94  95  96  97  99 100 101
asp leu cys lys lys asp glu asp ser ala thr glu thr cys OPA
gat ctg tgt aag aag gat gaa gat tcc gct aca gaa acc tgc tg
     ac aca ttc ttc cta ctt ctc agg cga tgt ctt tgg acg act taa

TABLE VII.A

Peptide and DNA Sequence of Domain L3 of TM
(TM AA Residues 66-101)

 66  67  68  69  70  71  72  73  74  75  76  77
asp leu cys lys lys cys asp pro thr glu val glu
gat ctg tgt aag aag tgt gat cca aca gag gta gag
cta gac aca ttc ttc aca cta ggt tgt ctc cat ctc

 78  79  80  81  82  83  84  85  86  87  88  89
leu asp asn gln ile val thr ala thr gln ser asn
ctg gac aat cag ata gtc act gcg act caa agc aac
gac ctg tta gtc tat cag tga cgc tga gtt tcg ttg

 90  91  92  93  94  95  96  97  99 100 101
ile cys asp glu asp ser ala thr glu thr cys
att tgc gat gag gac agc gct aca gaa acc tgc
taa acg cta ctc ctg tcg cga tgt ctt tgg acg











TABLE VIII

DNA and Primary Amino Acid Sequence of T4 Fragment
(TM AA Residues 102-141)

    102 103 104 109 110 111 112 113 114 115 116 117
    ser thr tyr asp arg asn lys cys tyr thr ala val
     gc acc tac gat agg aac aaa tgc tac acg gcc gtg
acg tcg tgg atg cta tcc ttg ttt acg atg tgc cgg cac

    118 119 120 121 122 123 124 125 126 127 128 129
    val pro leu val tyr gly gly glu thr lys met val
    gtt ccg ctc gtg tat ggt gga gag aca aaa atg gtg
    caa ggc gag cac ata cca cct ctc tgt ttt tac cac

    130 131 132 133 134 135 136 137 138 139 140 141
    glu thr ala leu thr pro asp ala cys tyr pro asp OPA
    gaa act gcc ctt acg ccc gat gca tgc tac cct gac tg
    ctt tga cgg gaa tgc ggg cta cgt acg atg gga ctg act taa

TABLE IX

DNA Sequence and Primary Amino Acid Sequence of a
Representative TM Core Element

  9  10  11  12  13  14  15  16  17  18  19  20
asp gln lys cys lys cys ala arg ile thr ser arg
gat cag aag tgc aag tgt gct cgt att act tct aga
cta gtc ttc acg ttc aca cga gca taa tga aga tct

 21  22  23  24  25  26  27  28  29  30  31  32
ile ile arg ser ser glu asp pro asn glu asp ile
atc atc cgt agc tca gag gac cca aat gaa gat ata
tag tag gca tcg agt ctc ctg ggt tta ctt cta tat

 33  34  35  36  37  38  39  40  41  42  43  44
val glu arg asn ile arg ile ile val pro leu asn
gtc gaa cgt aac atc cgt atc atc gtc cca ctg aat
cag ctt gca ttg tag gca tag tag cag ggt gac tta

 45  46  47  48  49  50  51  52  53  54  55  56
asn arg glu asn ile ser asp pro thr ser pro leu
aac cgg gag aat atc tca gat cct aca agt ccg ttg
ttg gcc ctc tta tag agt cta gga tgt tca ggc aac

 57  58  59  60  61  62  63  64  65  66  67  68
arg thr arg phe val tyr his leu ser asp leu cys
cgc aca cgc ttc gta tac cac ctg tca gat ctg tgt
gcg tgt gcg aag cat atg gtg gac agt cta gac aca

 69  70  92  93  94  95  96  97  99 100 101
lys lys asp glu asp ser ala thr glu thr cys OPA EcoRI
aag aag gat gag gac agc gct aca gaa acc tgc tg
ttc ttc cta ctc ctg tcg cga tgt ctt tgg acg act taa

TABLE X 

DNA Sequence and Primary Amino Acid Structure 
of a Representative TM 

9 10 11 12 13 14 15 16 17 18 19
asp gln lys cys lys cys ala arg ile thr ser
gat cag aag tgc aag tgt gct cgt att act tct
cta gtc ttc acg ttc aca cga gca taa tga aga

20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
arg ile ile arg ser ser glu asp pro asn glu asp ile val glu arg asn
aga atc atc cgt agc tca gag gac cca aat gaa gat ata gtc gaa cgt aac
tct tag tag gca tcg agt ctc ctg ggt tta ctt cta tat cag ctt gca ttg

37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53
ile arg ile ile val pro leu asn asn arg glu asn ile ser asp pro thr
atc cgt atc atc gtc cca ctg aat aac cgg gag aat atc tca gat cct aca
tag gca tag tag cag ggt gac tta ttg gcc ctc tta tag agt cta gga tgt

54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70
ser pro leu arg thr arg phe val tyr his leu ser asp leu cys lys lys
agt ccg ttg cgc aca cgc ttc gta tac cac ctg tca gat ctg tgt aag aag
tca ggc aac gcg tgt gcg aag cat atg gtg gac agt cta gac aca ttc ttc

71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87
cys asp pro thr glu val glu leu asp asn gln ile val thr ala thr gln
tgt gat cca aca gag gta gag ctg gac aat cag ata gtc act gcg act caa
aca cta ggt tgt ctc cat ctc gac ctg tta gtc tat cag tga cgc tga gtt

88 89 90 91 92 93 94 95 96 97 99 100 101 102
ser asn ile cys asp glu asp ser ala thr glu thr cys tyr OPA
agc aac att tgc gat gag gac agc gct aca gaa acc tgc tac tga attc
tcg ttg taa acg cta ctc ctg tcg cga tgt ctt tgg acg atg act
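The three sequence rows of Table X can be cross-checked mechanically. The following is a minimal sketch, not part of the original disclosure: it assumes the third row is the base-by-base complement of the coding row (written in the same left-to-right order, not reversed) and it reads OCR-damaged letters such as "e" as "c". The names PAIR, CODON_TABLE, complement and translate are illustrative only.

```python
# Base pairing used to write the lower (complementary) row under the coding row.
PAIR = {"a": "t", "t": "a", "g": "c", "c": "g"}

# Standard genetic code in the usual t, c, a, g ordering ("*" marks a stop codon).
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def complement(codons):
    """Complement each base in place; the lower row is assumed not to be reversed."""
    return " ".join("".join(PAIR[b] for b in codon) for codon in codons.split())


def translate(codons):
    """Translate space-separated codons into one-letter amino acid codes."""
    return "".join(CODON_TABLE[codon] for codon in codons.split())


# Residues 9-19 of Table X (coding row and complementary row).
coding = "gat cag aag tgc aag tgt gct cgt att act tct"
lower = "cta gtc ttc acg ttc aca cga gca taa tga aga"

print(complement(coding) == lower)  # do the two DNA rows agree?
print(translate(coding))            # one-letter form of the amino acid row
```

For residues 9-19 the translation corresponds to asp gln lys cys lys cys ala arg ile thr ser, which places a cysteine at position 14. The same check can be applied row by row to the rest of the table.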





TABLE XI 

DNA and Primary Amino Acid Sequence of TpS2 

101 102
cys ser asp asp asp asp lys ala gln thr glu thr cys thr val ala pro
gc gat gac gac gat aag gcc caa acg gag acc tgt act gtt gcg cct
act tcg cta ctg ctg cta ttc cgg gtt tgc ctc tgg aca tga caa cgc gga

arg glu arg gln asn cys gly phe pro gly val thr pro ser gln cys ala
cgt gaa cgg caa aac tgc gga ttc ccg gaa/gta aca ccc tct cag tgc gct
gca ctt gcc gtt ttg/acg cct aag ggc ctt cat tgt ggg aga gtc acg cga

asn lys gly cys cys phe asp asp thr val arg gly val pro trp cys phe
aat aaa ggc tgc tgt ttt gat gac acg gta cgg ggc gtt ccg tgg tgc ttc/
tta ttt ccg acg aca aaa cta ctg tgc cat gcc ccg/caa ggc acc acg aag

tyr pro asn thr ile asp val pro pro glu glu glu cys glu phe
tac ccc aat aca att gac gtt ccg cct gaa gaa gag tgc gag ccg taa g
atg ggg tta tgt taa ctg caa ggc gga ctt ctt ctc acg ctc ggc att cttaa

Example 2
Linkage of Biological Agents to a TM

This example illustrates the attachment of representative biological
agents to a TM.

A. Preparation of an anti-Influenza Virus Single Chain Antigen Binding Protein 
(SCABP) Attached to TM 

A TM containing a full-length native J chain domain may be attached to a
Cα3-Fv(γ+κ) anti-influenza virus SCABP.

Virus culture. Influenza virus A/Puerto Rico/8-Mount Sinai is grown in
fertilized chicken eggs, then concentrated and purified by differential
centrifugation. Virus is quantified in a plaque assay on Madin-Darby canine
kidney (MDCK) cells and, when desired, is inactivated with 0.05%
β-propiolactone plus 6 minutes of UV irradiation 20 cm from a germicidal lamp.

Production and characterization of anti-influenza virus MAbs. IgA and
IgG anti-influenza virus MAbs are produced by a mucosal immunization protocol.
Briefly, BALB/c mice are immunized intragastrically four times over an 8-week
period, the first three times with 0.5 mg of inactivated influenza virus plus
10 µg of cholera toxin (List Biological Laboratories, Inc., Campbell,
California). For the last immunization, cholera toxin is omitted and, in
addition to intragastric virus administration, mice also receive an intravenous
booster immunization with 30 µg of inactivated virus. Three days later, mice
are sacrificed and their splenic lymphocytes are hybridized to SP2/0 murine
myeloma cells. Clones are screened for secretion of IgA and IgG anti-influenza
virus antibody by an enzyme-linked immunosorbent assay (ELISA). After multiple
subclonings, stable IgA secretors are injected intraperitoneally into
pristane-primed BALB/c mice, the ascitic fluid is harvested, and the
specificities of the MAbs are confirmed by Western blotting. The biological
activities of the MAbs are characterized by determining an ELISA titer,
neutralization titer, and hemagglutination inhibition activity.



Isolation of mRNAs and Synthesis of cDNAs. mRNA derived from cell
lines producing IgA antibodies is isolated by established procedures using the
FastTrack™ mRNA isolation kit (Invitrogen). Specific primers are employed to
prime polymerase chain reactions that amplify the Fvγ section, the Cα3
section, and the Fvκ section in separate amplification reactions.

Fv heavy forward primers (SEQ ID NO: 84): 

5' TGGTACGAATTCCAGGT(G/C)(A/C)A(A/G)CTGCAG(G/C)AGTC 
(A/G)G 

Fv heavy back primer (SEQ ID NO:85): 

5' ACAGATATCGGGATTTCTCGCAGACTC 



The forward primer is 32-fold degenerate, as indicated by the nucleotides
in parentheses. The back primer encodes the first six amino acids of the CH1
constant region of the alpha chain.
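
The 32-fold figure follows from the five two-way choices written in parentheses (2 × 2 × 2 × 2 × 2 = 32). As an illustrative sketch only, the primer notation can be expanded into its individual sequences; the function name expand_degenerate and the parsing approach are assumptions made here, not part of the described method.

```python
import itertools
import re


def expand_degenerate(primer):
    """Expand a primer written with (X/Y) alternatives into every concrete sequence."""
    # Each match is either a parenthesised set of alternatives or a fixed run of bases.
    tokens = re.findall(r"\(([ACGT/]+)\)|([ACGT]+)", primer)
    choices = [alt.split("/") if alt else [fixed] for alt, fixed in tokens]
    return ["".join(parts) for parts in itertools.product(*choices)]


# Fv heavy forward primer (SEQ ID NO:84), joined onto one line.
FV_HEAVY_FORWARD = "TGGTACGAATTCCAGGT(G/C)(A/C)A(A/G)CTGCAG(G/C)AGTC(A/G)G"

variants = expand_degenerate(FV_HEAVY_FORWARD)
print(len(variants))   # number of distinct primer sequences in the mixture
print(variants[0])     # first member of the mixture
```

The same function applies to any of the primers in this example that are written with parenthesised alternatives.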

Cα3 forward primer (SEQ ID NO:86):

5' ACAGATATCGTGAACACCTTCCCACCC

Cα3 back primer (SEQ ID NO:87):

5' ACAAAGCTTTTATTTACCCGACAGACGGTC

The stop codon for the hybrid transcript is contained in the Cα3 back primer.

Fvκ forward primers (SEQ ID NO:88):

5' GTCCCCCCTCGAGCGA(T/C)AT(T/C)(C/G)(A/T)G(C/A)T(G/C)
ACCCA(A/G)TCT

Fvκ back primer (SEQ ID NO:89):

5' ACACTGCAGCAGTTGGTGCAGCATCAGC

Linker segment (SEQ ID NO:90):

5' CTGCAGGAAGCGGAAGCGGAGGAAGCGGAAGCGGAGGAA
GCGGAAGCGAATTC

The linker segment is synthesized using a PerSeptive Biosystems 8909
DNA Synthesizer and encodes glycine and serine residues, which enable the
proper folding of the antibody variable regions in the final protein. Sequences
at the termini enable ligation into the PstI and EcoRI sites of pBluescript.
The linker segment is first annealed with the following complementary DNA prior
to ligation into the vector (all other DNAs derived from PCR are double
stranded and restricted with the appropriate enzyme prior to ligation).

Linker complement (SEQ ID NO:91):

5' CCTTCGCCTTCGCCTCCTTCGCCTTCGCCTCCTTCGCCTTCGCT
TAA

Similarly, a signal peptide segment to enable translation of the final
protein into the endomembrane system of the insect cell is synthesized,
annealed to its complement and ligated into the BamHI and SmaI sites of
pBluescript.

Signal peptide (SEQ ID NO:92):

5' ACAGGATCCATGGAAACCCCAGCGCAGCTTCTCTTCCTCCTGC
TACTCTGGCTCCCAGATACCACCGGACCCGGG



The TM segment, synthesized by the phosphoramidite method so as to
contain cysteines at positions 14 and 68, also contains SacII and SpeI
restriction sites at its 5' and 3' ends, respectively. It is ligated directly
into the p2Bac™ vector (Invitrogen). The ligation reactions are performed
essentially as described in Sambrook et al. The other segments are first
ligated into pBluescript in the following order: linker segment (PstI/EcoRI),
Fvκ (SmaI/PstI), Fvγ (EcoRI/EcoRV), Cα3 (EcoRV/HindIII). The hybrid cDNA is
excised from the bacterial vector by BamHI and HindIII restriction enzyme
digestion, gel purified, and ligated into the p2Bac™ vector (Invitrogen) at the
BglII and HindIII sites. After cloning, the plasmids containing cDNAs in the
appropriate orientation are isolated and used for transformation of insect
cells as described above.

The resulting (Fvκ-linker-Fvγ-Cα3)2:TM (anti-HA-TM) protein,
containing two κγα segments per TM joined by disulfide bridges at the Cys14 and
Cys68 residues of TM, is purified by column chromatography essentially as
described above. Additional amino acids are incorporated into the fusion
protein at the DNA junction points as follows (the dash indicates the fusion
site of the individual segments): Pro-Gly at the SmaI site, Pro-Ala at the PstI
site, Glu-Phe at the EcoRI site, and Asp-Ile at the EcoRV site.
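
Several of the junction dipeptides listed above can be read directly off the restriction sites when the six-base site is translated in frame as two codons. The sketch below illustrates this for the SmaI, EcoRI and EcoRV sites. It assumes the standard recognition sequences for those enzymes and an in-frame reading, and the helper name junction_dipeptide is hypothetical. PstI is left out because its quoted Pro-Ala appears to depend on a base lying outside the six-base site.

```python
# Standard genetic code in T, C, A, G ordering ("*" marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys", "Q": "Gln",
    "E": "Glu", "G": "Gly", "H": "His", "I": "Ile", "L": "Leu", "K": "Lys",
    "M": "Met", "F": "Phe", "P": "Pro", "S": "Ser", "T": "Thr", "W": "Trp",
    "Y": "Tyr", "V": "Val", "*": "Stop",
}

# Recognition sequences of the enzymes named in the text (assumed to be read in frame).
SITES = {"SmaI": "CCCGGG", "EcoRI": "GAATTC", "EcoRV": "GATATC"}


def junction_dipeptide(site):
    """Translate a six-base site as two in-frame codons, e.g. 'Glu-Phe'."""
    return "-".join(THREE_LETTER[CODON_TABLE[site[i:i + 3]]] for i in (0, 3))


for enzyme, site in SITES.items():
    print(enzyme, site, junction_dipeptide(site))
```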

As a control, Fvκ-linker-Fvγ-Cα3 (anti-HA) is separately purified from
insect cells which do not co-express TM.

10 B. Preparation of Functional Genes Attached to TM 

Preparation of TM-polylysine conjugates. TM, isolated from biological
sources as described above, is covalently linked to poly(L-lysine) (Mr 20,000 D)
using the heterobifunctional crosslinking reagent N-succinimidyl
3-(2-pyridyldithio)propionate (SPDP) as previously described (Ferkol et al.,
J. Clin. Invest. 92:2394-2400, 1993). After reduction of the SPDP, TM is
incubated with a fifteenfold molar excess of poly(L-lysine)-SPDP and the
reaction is carried out at 2°C for 24 hours. The conjugate is dialyzed to
remove low molecular weight reaction products, and analyzed by separating the
resultant proteins using 0.1% SDS-7.5% polyacrylamide gel electrophoresis.

Reporter genes and plasmid preparation. The plasmids PRSVZ and
PRSVCAT, containing the Escherichia coli lacZ and chloramphenicol
acetyltransferase genes, respectively, ligated to the Rous sarcoma virus long
terminal repeat promoter inserted into a modified pBR322 vector, are used as
reporter genes. The plasmids are grown in E. coli DH5α, extracted and purified
by standard techniques. Digestion of the plasmids with restriction
endonucleases yields the appropriate fragments, and purity is established by
1.0% agarose gel electrophoresis.

Preparation of TM-polylysine-DNA complexes. Complexes are formed
by combining plasmid DNA with the TM-polylysine in 3 M NaCl. The charge ratio
of the DNA phosphate to lysine is ~1.2:1. Samples are incubated for 60 minutes
at 22°C, then dialyzed against 0.15 M NaCl for 16 hours through membranes with
a 3,500-dalton molecular mass limit. The complexes are filtered through a
Millipore filter with 15 µm pore size, and maintained at 4°C prior to use.

Determination of optimal conjugate to DNA proportion. To determine
the optimal proportion of conjugate to DNA, increasing amounts of the conjugate
are added to 10 µg of PRSVZ, producing 1:4, 1:8, 1:16, and 1:32 DNA to carrier
(TM) molar ratios. Samples are incubated as described above, and dialyzed
overnight against 0.15 M NaCl. The complexes are filtered before use. Samples
containing equal amounts of DNA (1 µg) are separated by 1.0% agarose gel
electrophoresis and stained with ethidium bromide. The plasmid DNA is
transferred onto a nitrocellulose filter and analyzed by Southern blot
hybridization, using the 2.3-kb EcoRI fragment of PRSVZ as a DNA probe.



C. Preparation of an Anti-C. difficile Toxin A Attached to TM

Cells and cultures. Cell media, culture, fusion procedures, and ascites
production to obtain monoclonal antibodies (MAbs) are as described by Harlow
and Lane, "Antibodies: A Laboratory Manual," Cold Spring Harbor Laboratory,
Cold Spring Harbor, NY, 1988. Mice receive subcutaneously 4.5 µg of inactivated
toxin (4% Formalin for 1 week at 4°C) with Freund's complete adjuvant (at days
-200, -190, and -120). On days -30 and -4, they receive by the same route
200 ng of native toxin without adjuvant. On the day of fusion, after
hemisplenectomy, spleen cells are fused with SP2/0 myeloma cells. Screening
procedures begin 10 days later with the neutralization assay, enzyme
immunoassay, and immunoblot procedures described below. Subcloning is done by
the limiting dilution method, and typing of MAbs is done by using a mouse MAb
isotyping kit (Amersham).

Approximately 10% of hybridomas are found to produce antibodies that
react with toxin A by immunoblot and by ELISA. Ascites are produced with the
most interesting clones (after the subcloning procedure) and analyzed for
immunoreactivity with native toxin A.

Toxin A (partially purified after acid precipitation as described by
Towbin et al., Proc. Natl. Acad. Sci. USA 76:4350-4354, 1979) is neutralized by
ascites fluid as follows. The dose of toxin A used is adjusted to cause 100%
mortality within 24 hours postinoculation (about 1 ng per mouse). The toxin is
mixed with MAb ascites (final dilution 1:3), incubated for 1 hour at 37°C, and
injected intraperitoneally into mice (five animals per group). Survival is
determined 15 hours later. For the toxin A antibody ELISA, microtiter wells are
coated with 0.5 µg of native toxin A overnight in carbonate buffer, pH 9.6.
Wells are washed, and 0.1 ml of ascites fluid is added for 1 hour and then
serially diluted. Wells are washed and goat anti-mouse IgG (whole
molecule)-alkaline phosphatase conjugate (1:1,000) is added. Wells are washed
and incubated with para-nitrophenyl phosphate (Sigma 104 tablets; Sigma
Chemical Co., St. Louis, MO) in ethanolamine buffer (pH 9.8). The absorbance at
405 nm is determined after 1 hour at room temperature. Wells that give an A405
value two times higher than background are considered positive.

Titers correspond to the log10 of the highest dilution of sample which had
an optical density of twice the background. Sodium dodecyl sulfate (SDS)-PAGE
is done by the method of Laemmli. Samples are subsequently transferred to
nitrocellulose as described by Towbin et al. and screened with 1:3 dilutions of
hybridoma tissue culture or ascites fluids, followed by the addition of a 1:500
dilution of goat anti-mouse IgG (whole molecule)-horseradish peroxidase
conjugate. Staining is done with diaminobenzidine (5 mg/mL) and hydrogen
peroxide. Double agar diffusion (1.59% agar concentration) is performed with
crude C. difficile supernatant containing toxin A (1 mg/ml) and ascites fluid
(diluted 1:10). Positive reactions are observed 2 days later.
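
The titer definition given above (log10 of the highest dilution whose optical density is at least twice background) can be written out as a short calculation. This is a minimal sketch under stated assumptions: the function name elisa_titer is hypothetical, the dilution series and absorbance values are invented purely for illustration, and "twice the background" is taken to mean greater than or equal to two times the background reading.

```python
import math


def elisa_titer(readings, background):
    """Return log10 of the highest positive dilution, or None if no well is positive.

    readings maps a dilution factor (1000 means a 1:1,000 dilution) to its
    absorbance; a well counts as positive at >= 2 x background (assumed cutoff).
    """
    positive = [dilution for dilution, od in readings.items() if od >= 2 * background]
    if not positive:
        return None
    return math.log10(max(positive))


# Hypothetical absorbance readings for a tenfold dilution series.
example_readings = {10: 1.80, 100: 1.20, 1000: 0.45, 10000: 0.15}
example_background = 0.10

print(elisa_titer(example_readings, example_background))
```

In this invented series the 1:1,000 well is the last one above the cutoff, so the reported titer is the log10 of 1,000.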

Isolation of mRNAs and Synthesis of cDNAs. mRNA derived from cell
lines producing IgG antibodies is isolated by established procedures using the
FastTrack™ mRNA isolation kit (Invitrogen). Specific primers are employed to
prime polymerase chain reactions that amplify the Fv-Cγ1 section and the
entire κ chain in separate amplification reactions.



Heavy chain forward primer (SEQ ID NO:93):

5' TGGTACAGATCTAGGT(G/C)(A/C)A(A/G)CTGCAG(G/C)AGTC
(A/G)G

Heavy chain back primer (SEQ ID NO:94):

5' ACAGAATTCAATTTTCTTGTCCACCTT

The forward primer is 32-fold degenerate, as indicated by the nucleotides
in parentheses. The back primer encodes the last six amino acids of the CH1
constant region of the gamma chain. The kappa chain is amplified in its
entirety.

Kappa forward primers (SEQ ID NO:95):

5' GTTCTAGAGA(T/C)AT(T/C)(C/G)(A/T)G(C/A)T(G/C)ACCCA(A/G)
TCT

Kappa back primer (SEQ ID NO:96):

5' ACACCGCGGCAGTTGGTGCAGCATCAGC

A signal peptide segment enabling translation of the final protein into the
endomembrane system of the insect cell is synthesized, annealed to its
complement and ligated into the BamHI and BglII sites of the p2Bac™ vector
(Invitrogen) for heavy chain-TM expression and into the SpeI and XbaI sites of
p2Bac™ for expression of the kappa chain.

Signal peptides:

Heavy chain (SEQ ID NO:97)
5' ACAGGATCCATGGAAACCCCAGCGCAGCTTCTCTTCCTCCTG
CTACTCTGGCTCCCAGATACCACCGGAAGATCT

Light chain (SEQ ID NO:98)
5' ACAACTAGTATGGAAACCCCAGCGCAGCTTCTCTTCCTCCTG
CTACTCTGGCTCCCAGATACCACCGGATCTAGA

The TM segment, synthesized by the phosphoramidite method to contain
serines at positions 14 and 68, also contains EcoRI and HindIII restriction
sites at its 5' and 3' ends, respectively. A stop codon is included in the
proper reading frame to halt translation of the fusion transcript.

The segments are ligated directly into the p2Bac vector in the following
order for the heavy chain-TM fusion: signal sequence (BamHI-BglII), heavy chain
segment (BglII-EcoRI), TM segment (EcoRI-HindIII). The order for the kappa
chain: signal sequence (SpeI-XbaI), kappa chain (XbaI-SacII).

The resulting heavy chain (Fv-CH1)-TM:kappa chain hybrid protein
(anti-C. difficile-TM), joined by disulfide bridges through the constant
regions of the heavy and light chains, is purified by column chromatography
essentially as described above. Additional amino acids are incorporated into
the fusion protein at the DNA junction points as follows (the dash indicates
the fusion site of the individual segments): Arg-Ser at the BglII site, Glu-Phe
at the EcoRI site, Ser-Arg at the XbaI site.

D. Preparation of TM with various linkers to fluorescent compounds or anticancer 
drugs. 

General method for Fmoc synthesis of peptide linkers. Reactions were
generally performed at the 0.2 mmol scale and followed previously described
procedures (M. Bodanszky, A. Bodanszky, The Practice of Peptide Synthesis,
Springer-Verlag, Berlin, 1984; M. Bodanszky, Peptide Chemistry: A Practical
Textbook, Springer-Verlag, Berlin, 1988). Coupling reactions were initiated at
the carboxy terminus using a protected amino acid (amino acid #1) immobilized
to a p-alkoxybenzyl alcohol resin (e.g., Fmoc-Lys(Boc)-resin, Peninsula
Laboratories (Belmont, California) product #FM058AAR, 0.2-0.5 meq/g).
Protecting groups for the primary amines comprised the
9-fluorenylmethyloxycarbonyl group, Fmoc. R group protection (e.g., trityl,
t-butyl, butoxycarbonyl, acetamidomethyl, ethylthio) depended on the nature of
the R group. Reactions were carried out in a funnel containing a sintered glass
filter (e.g., Kimax #28400-301) fitted with a two-way stopcock. The Fmoc
protecting group on amino acid #1 was first removed by incubation in 20%
piperidine in dimethylformamide (DMF) for 15 minutes at room temperature.
Piperidine was then washed out with excess DMF. Fmoc-protected amino acid #2
(1 mmol) dissolved in minimal DMF (~1 ml) was added to the resin, followed by
the addition of 1 mmol hydroxybenzotriazole also dissolved in minimal DMF.
Coupling was initiated by the addition of 1 mmol diisopropylcarbodiimide. The
reaction was allowed to proceed at room temperature with gentle shaking for
1 hour. The resin was then washed with excess DMF to remove all reagents. The
efficiency of the reaction was monitored using a standard ninhydrin assay
(Pierce product #21205). The procedures were then repeated (i.e., deprotect,
wash, couple, wash) for the addition of each amino acid comprising the desired
sequence. The final peptide was removed from the resin by incubation at room
temperature for 1-3 hours in 95% TFA containing water and scavengers (e.g.,
triisopropylsilane, ethanedithiol, thioanisole, bromotrimethylsilane). This
procedure removes all R-group protection as well. Peptide was precipitated from
the TFA solution by the addition of 4 volumes of diethyl ether. The peptide
pellet was redissolved in DMF and purified by reverse phase liquid
chromatography.
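
The repeated deprotect, wash, couple, wash cycle described above can be laid out as an ordered step list for a given peptide. The sketch below is only an illustration of that ordering, using the PLGIIGG linker from this example as input. The function name spps_steps is hypothetical, and the step wording is condensed from the paragraph above rather than taken from any instrument protocol.

```python
def spps_steps(sequence):
    """List the bench steps for Fmoc synthesis of a peptide given N-to-C in one-letter code.

    Synthesis runs from the carboxy terminus, so the last residue of the
    sequence is amino acid #1 (already on the resin) and residues are added
    in reverse order.
    """
    residues = list(sequence)
    steps = ["start: resin-bound Fmoc-%s (amino acid #1, carboxy terminus)" % residues[-1]]
    for number, residue in enumerate(reversed(residues[:-1]), start=2):
        steps.append("deprotect: 20% piperidine in DMF, 15 min, room temperature")
        steps.append("wash: excess DMF")
        steps.append(
            "couple: Fmoc-%s + hydroxybenzotriazole + diisopropylcarbodiimide, "
            "1 mmol each, 1 h (amino acid #%d)" % (residue, number)
        )
        steps.append("wash: excess DMF")
    steps.append("cleave: 95% TFA with water and scavengers, 1-3 h, room temperature")
    return steps


for line in spps_steps("PLGIIGG"):
    print(line)
```

For a seven-residue peptide this gives six coupling cycles between the starting resin and the final cleavage.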

Fluorescent compound with a scissile linker attachment to synthetic TM.
The polyimmunoglobulin receptor sequence from residues 585-600
(AIQDPRLFAEEKAVAD; SEQ ID NO:45), which is the substrate for an intracellular
processing protease, is synthesized by peptide coupling as described above. The
peptide is synthesized from a Gly-thioester resin support, yielding a C
terminal Gly-αCOSH after cleavage. Prior to release from the column, the amino
terminus of the peptide is reacted with NHS-fluorescein (1 mmol dissolved in
1 ml DMF) (Pierce product #46100). The peptide is then released from the column
to yield a fluoresceinated amino terminus and a reactive thioester group at the
carboxy end. The fluoresceinated peptide (10 µmol) is attached to TM (1 µmol)
by reaction of the peptidyl thioester group with the bromoacetyl group at
residue 1 of TM (structure E #2, Table II). The derivatized TM is then purified
from the reaction mixture by column chromatography (NAP-10 column, Pharmacia).
This compound is referred to as TM-peptide-FL. Control preparations are
performed in identical fashion except the synthetic peptide linker has no
cleavage site: VAVQSAGTPASGS (SEQ ID NO:99).

Fluorescent compound with a scissile linker attachment to purified
dimeric IgA. The peptide is synthesized with an additional cysteine residue at
the C terminus to yield the sequence AIQDPRLFAEEKAVADC (SEQ ID NO:45). Prior to
release from the column, the amino terminus of the peptide is reacted with
NHS-fluorescein (1 mmol dissolved in 1 ml DMF) (Pierce product #46100). The
peptide is then released from the column to yield a fluoresceinated amino
terminus and a reactive sulfhydryl group at the carboxy end. Dimeric IgA
(100 nmol) purified from biological sources as described above is reacted with
sulfosuccinimidyl 4-[N-maleimidomethyl]cyclohexane-1-carboxylate (sulfo-SMCC,
10 µmol, Pierce product #22322) according to the manufacturer's protocol. The
compound reacts with free amino groups via the sulfosuccinimidyl moiety and
thereby attaches a reactive maleimide group for reaction with free sulfhydryls.
The dIgA-SMCC derivative is purified from the reaction mixture by column
chromatography in 25 mM phosphate buffer, pH 6.8, containing 1 mM EDTA (NAP-10
column, Pharmacia). The purified dIgA in ~1 ml buffer is immediately reacted
with the fluoresceinated peptide containing a free sulfhydryl group (10 µmol
dissolved in 200 µl DMF) for 12 hours at 4°C. The derivatized dIgA is then
purified from the reaction mixture by column chromatography (NAP-10 column,
Pharmacia). This compound is referred to as dIgA-peptide-FL. Control
preparations are performed in identical fashion except the synthetic peptide
linker has no cleavage site: VAVQSAGTPASGS (SEQ ID NO:99).

Anti-cancer drug attached to TM via a scissile peptide and a
pH-sensitive hydrazide linker. 3-Deamino-3-(4-morpholinyl)-doxorubicin (MRA) is
prepared from doxorubicin (Aldrich, Milwaukee, Wisconsin) by reacting via
dialdehyde, followed by a reaction with sodium cyanoborohydride as previously
described (Mueller et al., Antibody, Immunoconjugates, and Radiopharmaceuticals
4:99-106, 1991). MRA is purified after separation on a silica gel column, and
is modified with a peptide spacer by the following procedure. First, the
peptide PLGIIGG (SEQ ID NO:109) is esterified to yield the corresponding methyl
ester. This is followed by condensation of the amino terminal of the peptide
with succinic anhydride, followed by reaction of the ester terminal with
hydrazine hydrate to yield the monohydrazide. The hydrazide moiety of this
activated peptide is then reacted via the C-13 carbonyl group of MRA to yield
MRA-PLGIIGG (SEQ ID NO:109), which is purified by preparative thin layer
chromatography (TLC). The purified drug-linker intermediate is reacted at the
succinic acid terminal with dicyclohexyl carbodiimide (DCC) and
N-hydroxysuccinimide (NHS). This activated compound is again purified by TLC
and then coupled to the lysine residues of TM by adding a 20-fold excess of
MRA-PLGIIGG (SEQ ID NO:109) to purified TM at pH 8 for 3 hr. The TM used in
this preparation is isolated from biological sources as described above. This
conjugate is referred to as TM(bio)-MRA.

The conjugation reaction mixture is centrifuged to remove precipitated
material and is applied to a column of Sephadex G-50 equilibrated with 50 mM
sodium phosphate, 0.1 M NaCl (pH 7.0). The fractions containing TM(bio)-MRA
conjugate are pooled and stored at 4°C. The drug-to-TM ratio is determined by
spectrophotometry at 280 and 480 nm using extinction coefficients of
9.9 mM⁻¹ cm⁻¹ and 13 mM⁻¹ cm⁻¹, respectively. The conjugates are analyzed by
HPLC on a Dupont GF-250 gel filtration column and by NaDodSO4/PAGE on 7.5%
acrylamide gels under nonreducing conditions.

Anti-cancer drug attached to dimeric IgA via a scissile peptide and a
pH-sensitive hydrazide linker. The activated drug linker compound, prepared as
described above, is coupled to the lysine residues of dimeric IgA by adding a
20-fold excess of MRA-PLGIIGG (SEQ ID NO:109) to purified dIgA at pH 8 for
3 hr. The dIgA used in this preparation is isolated from biological sources as
described above. This conjugate is referred to as dIgA-MRA.

The conjugation reaction mixture is centrifuged to remove precipitated
material and is applied to a column of Sephadex G-50 equilibrated with 50 mM
sodium phosphate, 0.1 M NaCl (pH 7.0). The fractions containing
dIgA-PLGIIGG-MRA (SEQ ID NO:109) conjugate are pooled and stored at 4°C. The
drug-to-dIgA ratio is determined by spectrophotometry at 280 and 480 nm using
extinction coefficients of 9.9 mM⁻¹ cm⁻¹ and 13 mM⁻¹ cm⁻¹, respectively. The
conjugates are analyzed by HPLC on a Dupont GF-250 gel filtration column and by
NaDodSO4/PAGE on 7.5% acrylamide gels under nonreducing conditions.

Fluorescent compound targeted for retention in the endoplasmic
reticulum. The scissile peptide AIQDPRLFAEEKAVAD (SEQ ID NO:45) is prepared as
described above to contain an amino terminal fluorescein and a free sulfhydryl
from an additional cysteine at the carboxy terminal. TM (100 nmol) purified
from transgenic insect cells as described above is reacted with
sulfosuccinimidyl 4-[N-maleimidomethyl]cyclohexane-1-carboxylate (sulfo-SMCC,
10 µmol, Pierce product #22322) and purified as described above. The purified
TM-SMCC in ~1 ml buffer is immediately reacted with the fluoresceinated peptide
containing a free sulfhydryl group (10 µmol dissolved in 200 µl DMF) as
described above. The derivatized TM is then purified from the reaction mixture
by column chromatography (NAP-10 column, Pharmacia). The ER retention signal
KDEL is synthesized as part of the TM core protein by phosphoramidite
oligonucleotide coupling as described above and ligated into an insect
expression vector to create pTM. The final compound is referred to as
TM(kdel)-peptide-FL.

Anti-cancer drug targeted for retention in the endoplasmic reticulum.
The activated drug linker compound, prepared as described above, is coupled to
the lysine residues of TM by adding a 20-fold excess of MRA-PLGIIGG (SEQ ID
NO:109) and purified as described above. The TM used in this preparation is
isolated from transgenic insect cells. The ER retention signal KDEL is
synthesized as part of the TM core gene by phosphoramidite oligonucleotide
coupling as described above and ligated into an insect expression vector to
create pTM. This conjugate is referred to as TM(KDEL)-MRA.

Fluorescent compound targeted to the nucleus. Two nuclear targeting
sequences, AAPKKKRKV (SEQ ID NO:100) and AAKRPAAIKKAGQAKKKK (SEQ ID NO:101),
are synthesized with amino terminal fluorescein and an additional carboxy
terminal cysteine as described above. TM (100 nmol) purified from biological
sources as described above is reacted with sulfo-SMCC and purified as described
above. The purified TM-SMCC in ~1 ml buffer is immediately reacted with the
fluoresceinated peptide containing a free sulfhydryl group (10 µmol dissolved
in 200 µl DMF) as described above. The derivatized TM is then purified from the
reaction mixture by column chromatography (NAP-10 column, Pharmacia). The final
compound is referred to as TM-peptide(nuc)-FL. Control preparations are
performed in identical fashion except the synthetic peptide linker has no
targeting function: VAVQSAGTPASGS (SEQ ID NO:99).

Anti-cancer drug tethered to an antigen combining site. The linker
peptide PLGIIGG (SEQ ID NO:109) is first coupled to MRA via the hydrazide as
described above. In this procedure, however, the succinic anhydride step is
omitted, yielding a peptide-MRA containing a free amino terminus. The purified
drug-linker intermediate is reacted at the amino terminal with dicyclohexyl
carbodiimide (DCC) and N-hydroxysuccinimide (NHS) and a 20-fold excess of
diketone 1 (Wagner et al., Science 270:1797-1800, 1995). The 1,3-diketone 1 is
synthesized as described in Wagner et al.

The diketone-peptide-MRA conjugate is reacted with the antigen
combining site of antibody 38C2 (Wagner et al.) engineered to be covalently
linked to TM. The engineering procedures to produce TM-38C2 are essentially as
described above in Example 2C. mRNA derived from a cell line producing 38C2
antibody is isolated by established procedures. Specific primers are employed
to prime polymerase chain reactions resulting in amplification of the Fv-Cγ1
section and the entire kappa chain in separate amplification reactions as
described above.

The resulting heavy chain (Fv-CH1)-TM:kappa hybrid antibody, joined by
disulfide bridges through the constant regions of heavy and light chains, is
purified as described above.

Reaction of the hybrid antibody with the diketone-peptide-MRA results
in a stable vinylogous amide linkage between the diketone moiety and the
epsilon amino group of a lysine residue in the binding pocket. The final
compound is referred to as TM(38C2)-MRA.

Intestinal trefoil factor attached to TM via a carbohydrate linker. The
porcine intestinal trefoil factor (ITF) is purified using a specific antibody
as described (Suemori et al., Proc. Natl. Acad. Sci. USA 88:11017-11021, 1991).
TM, synthesized as described above by peptide coupling and corresponding to the
structure described in Table II E. #2, is linked to the enterokinase
recognition sequence, (Asp)4-Lys, by procedures described above. The
recognition sequence is synthesized from a Gly-thioester resin support,
yielding a C terminal Gly-αCOSH after cleavage. The sequence is further
modified to contain an amino terminal cysteine. The released peptide is coupled
to TM by reaction of the thioester and the bromoacetyl functional groups. ITF
is then derivatized to be reactive with sulfhydryl groups by reaction with
sulfo-SMCC as described above. After purification, ITF-SMCC is coupled to the
(Asp)4-Lys-TM and purified as described above. The reaction results in coupling
of ITF to TM via a peptide linker which is a substrate for enterokinase
associated with the apical surface of the intestinal epithelial barrier. The
compound is referred to as TM-ITF.

Example 3

Intracellular Delivery Of A Biological Agent 

This example illustrates the use of a TM prepared as described in 
Example 2 for delivery of biological agents to epithelial cells. 

A. Intracellular Colocalization of TM and HA Viral Protein and Neutralization of
Virus

Intracellular Co-localization of TM and HA. MDCK cells stably
transfected with cDNA encoding the rabbit pIgR are cultured on nitrocellulose
filters in microwell chambers (Millicell, Millipore, Bedford, Massachusetts).
Confluent pIgR+ MDCK cell monolayer filters are infected with influenza virus
(1 PFU per cell) via the apical surface for 60 minutes at 37°C. After 8 hours,
equivalent ELISA titers of either anti-HA-TM or anti-HA are added to the lower
compartment. Twenty-four hours after the addition of antibody, cells are
detached with trypsin (0.25% in 0.02% EDTA) (JRH Biosciences, Lenexa, Kansas),
cytocentrifuged onto glass slides, and fixed with acetone. Two-color
immunofluorescence is used to detect HA glycoprotein and Cα3 simultaneously.
The slides are incubated with fluorescein-labeled goat anti-murine IgA
(Southern Biotechnology Associates, Inc., Birmingham, Alabama) and, after
extensive washing with PBS, biotin-labeled murine IgG anti-HA MAb (directed
against a different epitope from the anti-HA and anti-HA-TM antibody added to
the cells in culture) in 1% bovine serum albumin in phosphate-buffered saline
(PBS) is added for 1 hour at room temperature. After the slides are washed in
PBS, HA protein is detected with Texas Red-conjugated streptavidin (Fisher
Biotech, Pittsburgh, Pennsylvania).

Anti-HA-TM colocalizes with HA viral proteins as documented by two-color
immunofluorescence, by which identical microscopic fields are viewed through
separate filters that discriminate the appropriate wavelengths. Compartments
containing anti-HA-TM are green, while those containing HA proteins are red. In
double exposures, cellular sites in which both anti-HA-TM and HA proteins are present
appear yellow. These observations are consistent with the hypothesis that during
epithelial transcytosis, specific anti-HA-TM antibody can interact with newly
synthesized viral HA protein. In contrast, infected monolayers treated with specific
anti-HA containing no TM do not demonstrate intracellular antibody localization since IgG
sequences are not transported through the epithelium. Influenza-infected cells treated
with irrelevant IgAs, including IgA anti-Sendai virus HN and IgA anti-dinitrophenol, do
not stain for the presence of antibody, indicating that accumulation of intracellular
anti-HA-TM is due to combination with viral protein and not a result of nonspecific
interference of IgA transport by the viral infection. In addition, uninfected monolayers
treated with specific anti-HA without TM do not demonstrate intracellular aggregation
of antibody. Collectively, these studies document that in cells infected with virus,
transport of specific anti-HA-TM, but not irrelevant IgA or anti-HA without TM, is
impeded, resulting in intracellular accumulation only of specific anti-HA-TM.

Neutralization of Virus. The following experiments demonstrate that
anti-HA-TM can interact with intracellular HA proteins within infected epithelial cells
in such a manner as to reduce viral titers. Confluent MDCK cells expressing the pIgR
are infected with influenza virus as described above. Six hours later, equivalent ELISA
titers of anti-HA, anti-HA-TM, or MOPC-315, an irrelevant murine IgA, or anti-Sendai
virus HN MAbs are added to the lower chamber. In some experiments, anti-murine
IgA, in an amount that is predetermined to effectively inhibit specific IgA from binding
to and neutralizing virus as documented in ELISA and plaque reduction assays, is
added to the apical chamber of some groups. After an additional 4 hours, the specific
IgA is removed from the basal chamber and the basal surface of the cell layer is
washed. Monolayers are then incubated for an additional 24 hours at 37°C, at which
time the apical supernatants are removed. Cells are scraped off the filters into PBS and
disrupted by three successive freeze-thaw cycles. Cellular debris is removed from the
lysate by centrifugation. The apical supernatants and cell lysates are tested for virus by
plaque assay in which samples are pretreated with 5 µg of trypsin (Gibco, Grand Island,
New York) to activate virus. Comparisons among groups in each experiment are made
by one-way analysis of variance with Fisher's protected t test.
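
The group comparison named above (one-way analysis of variance followed by Fisher's protected t test) can be sketched as follows. This is a minimal illustration in plain Python, not the analysis code used for these experiments: the function names, the example titers, and the critical F value supplied by the caller are all assumptions introduced here. Pairwise t statistics are computed from the pooled within-group mean square, and only when the overall F statistic exceeds the chosen critical value.

```python
from itertools import combinations
from math import sqrt


def one_way_anova(groups):
    """Return (F statistic, within-group mean square, within-group df)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_within = n_total - k
    ms_within = ss_within / df_within
    f_stat = (ss_between / (k - 1)) / ms_within
    return f_stat, ms_within, df_within


def protected_t(groups, f_crit):
    """Pairwise t statistics, computed only if the overall F exceeds f_crit.

    f_crit is looked up by the caller for the chosen significance level and
    degrees of freedom (an assumption of this sketch).
    """
    f_stat, ms_within, _ = one_way_anova(groups)
    if f_stat <= f_crit:
        return {}  # overall test not significant: no pairwise comparisons
    means = [sum(g) / len(g) for g in groups]
    return {
        (i, j): (means[i] - means[j])
        / sqrt(ms_within * (1 / len(groups[i]) + 1 / len(groups[j])))
        for i, j in combinations(range(len(groups)), 2)
    }


# Hypothetical log10 virus titers for three treatment groups (illustrative only).
example_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [7.0, 8.0, 9.0]]
example_t = protected_t(example_groups, f_crit=5.14)  # F(2, 6) at the 0.05 level
```

Each resulting t statistic would then be compared with the critical t value for the within-group degrees of freedom.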

Mean virus titers are significantly reduced in both the supernatants and
cell lysates of polarized epithelial monolayers treated with anti-HA-TM compared with
those from monolayers receiving anti-HA without TM. IgA anti-Sendai virus HN does
not reduce influenza virus titers, nor does an irrelevant IgA, MOPC-315. In addition,
high titers of anti-IgA added to the apical surface of the cells do not reduce the ability
of anti-HA-TM to neutralize the virus, demonstrating that the neutralization is occurring
inside the epithelial cell and is not the result of anti-HA-TM accumulating in the apical
supernatant.

B. Delivery of Genes to Epithelial Cells Using TM-Polylysine 

Cells and cell culture. Human colonic carcinoma (HT29) cells are
cultured as described by Chintalacharuvu et al., J. Cell. Physiol. 148:35-47, 1991, and
maintained in RPMI Media 1640. Human tracheal epithelial cells are harvested from
necropsy specimens less than 24 hours postmortem and cultured as described by Ferkol
et al., J. Clin. Invest. 92:2394-2400, 1993. Cells are grown on collagen gel matrices or
on uncoated plates. Transfections are performed when the cells are 50 to 95%
confluent. Viability of cells is determined by trypan blue exclusion.

DNA delivery to cells. Four days before transfection, the HT29 cells are
washed twice with PBS, pH 7.4. Half of the cells are returned to RPMI Media 1640,
and the remaining half are grown in Leibovitz L-15 Media, a glucose-deficient culture
medium. Human gamma interferon, 100 U/ml, is added to half of the cells grown in
glucose-deficient media 2 days before transfection. Transfer of HT29 cells to
glucose-free media increases expression of pIgR, as does treatment with human gamma
interferon. Cell density is approximately 5 x 10^ cells per plate at the time of
transfection. Growth medium is changed and the cells are washed with PBS. Solutions
containing TM-polylysine-DNA complex (2.5 pmol DNA noncovalently bound to 10,
20, 40, or 80 pmol TM), polylysine-DNA complex (2.5 pmol DNA complexed with 1.2
nmol polylysine), TM-polylysine (80 pmol) alone, or 2.5 pmol (20 µg) DNA alone, are
added to individual plates. Each sample is filtered prior to transfection of cells. After
the cells are incubated for 48 hours at 37°C, β-galactosidase assays are performed
either in vitro or in situ.
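
The quantities in the preceding paragraph can be cross-checked with a little arithmetic. The sketch below is only an illustration under stated assumptions: it reads the DNA-alone control as 2.5 pmol corresponding to 20 µg, the helper names are hypothetical, and the figure of roughly 650 g/mol per base pair is a common approximation that does not come from the text.

```python
def implied_molar_mass(mass_ug: float, amount_pmol: float) -> float:
    """Molar mass in g/mol implied by a mass (micrograms) and an amount (picomoles)."""
    return (mass_ug * 1e-6) / (amount_pmol * 1e-12)


def molar_ratios(carrier_pmol, dna_pmol):
    """Carrier-to-DNA molar ratio for each carrier amount."""
    return [c / dna_pmol for c in carrier_pmol]


AVG_BP_MASS = 650.0  # g/mol per base pair; approximation, not from the text

# 20 micrograms stated as equivalent to 2.5 pmol of plasmid DNA.
plasmid_mass = implied_molar_mass(20.0, 2.5)

# Approximate plasmid size implied by that molar mass.
approx_size_bp = plasmid_mass / AVG_BP_MASS

# TM amounts of 10, 20, 40, and 80 pmol per 2.5 pmol DNA.
tm_to_dna = molar_ratios([10, 20, 40, 80], 2.5)

# 1.2 nmol (1200 pmol) polylysine per 2.5 pmol DNA.
polylysine_to_dna = molar_ratios([1200], 2.5)[0]
```

Under these assumptions the TM series spans a 4-fold to 32-fold molar excess over DNA, and the implied plasmid is on the order of 12 kilobase pairs.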

When primary cultures of human tracheal epithelial cells are 50%
confluent, cells are washed once with PBS, pH 7.4, and the media is changed
immediately before transfection. The conjugate-DNA complex, containing 10 µg (~1.3
pmol) plasmid, is applied and permitted to remain on the cells for 48 hours. The cells
are then either harvested for protein extraction or fixed for in situ β-galactosidase
assays.

Assays for β-galactosidase activity. The cells are washed in cold
phosphate buffer once, then scraped from the plate in a solution consisting of 10 mM
Tris, pH 7.5, 150 mM NaCl, and 1 mM EDTA. After centrifugation at 10,000 rpm for 1
minute, the cell pellets are resuspended in 100 µl 250 mM Tris, pH 7.8, and lysed by
repeated freezing and thawing. Aliquots of the supernatant are assayed for protein
content, and samples of supernatants containing equal amounts of protein are incubated
at 37°C for 12 hours with 520 mg ONPG as described by Lim and Chase,
BioTechniques 7:576, 1989. The optical density of the samples is measured at 420 nm.

Individual cells expressing β-galactosidase are also identified following
incubation with X-gal as described by Lim and Chase. Briefly, the cells are fixed with a
solution of 1% glutaraldehyde in PBS for 15 minutes, and then incubated with a
solution containing 0.5% X-gal for 12 to 16 hours at either 22 or 37°C. Blue colored
cells are identified by phase-contrast light microscopy. A minimum of 100 cells are
counted to determine the percentage of cells expressing β-galactosidase.

Immunohistochemical staining of cells for pIgR. The expression of pIgR
in human tracheal epithelial cells transfected with the plasmid pRSVZ is determined by
indirect immunofluorescence. After in situ β-galactosidase staining, the cells are fixed
with a solution containing 2% paraformaldehyde, 10 mM NaIO4, 37 mM Na2HPO4,
and 75 mM lysine, pH 6.2, for 2 hours. The cells are made permeable by treatment with
PBS containing 0.1% (w/v) ovalbumin and 0.5% saponin, then incubated sequentially
with rabbit anti-human SC and fluorescein conjugated goat anti-rabbit IgG. Both
antibodies are diluted 1:100 in PBS containing 0.1% (w/v) ovalbumin and 0.5%
saponin. Between each incubation, the cells are washed three times with PBS
containing 0.1% (w/v) ovalbumin. The stained cells are examined by fluorescence
microscopy.

Expression of β-galactosidase in epithelial cells. Immunohistochemical
evaluation and measurement of β-galactosidase activity are used to assess delivery of
functional vector sequences to epithelial cells. The percentage of cells expressing
β-galactosidase is comparable to the percentage of cells that express pIgR.

In general, the results demonstrate that expression plasmids
noncovalently bound to TM-polylysine can be introduced efficiently into epithelial
cells. Delivery is specific for cells that express pIgR, since human tracheal epithelial
cells grown on plastic, a condition that down-regulates the expression of the receptor,
fail to express the reporter gene whereas cells from the same trachea maintained on
collagen gels can be transfected. The transfection of HT29 cells is also dependent on
the level of expression of pIgR, since cells grown in conditions that up-regulate the
receptor express the reporter gene more than cells grown in undifferentiated conditions.
Competition for the pIgR with dimeric IgA in a fourfold molar excess blocks the
delivery of the complex, indicating that the binding site(s) on the pIgR for dimeric IgA
and TM-polylysine overlap. Uptake is not due to a non-specific increase in pinocytosis
secondary to the presence of the TM-polylysine in the culture medium since the
addition of TM-polylysine with uncomplexed DNA or the carrier-DNA complex after
dissociation with DTT does not result in an increase in reporter gene expression.
Moreover, the use of complexes with Fab fragments from irrelevant antibodies does not
permit the uptake and expression of the reporter gene. Thus, the uptake and expression
of the reporter gene are mediated by the specific interaction of the TM-polylysine with
pIgR.



C. Protection Against Pseudomembranous Colitis in Mice Using Anti-C. difficile-TM

C. difficile strain VPI 10463 (referred to as VPI) is grown in brain-heart
infusion (BHI) broth (Difco Laboratories, Detroit, Michigan). For plate counts, samples
are homogenized, serially diluted, and plated on BHI agar. Colony counting is
performed after incubation at 37°C for 2 days. The toxin A preparation is obtained by
using C. difficile grown within a dialysis bag in flasks containing autoclaved BHI.
Flasks are incubated for 4 days at 37°C in an anaerobic chamber. Toxin A is purified as
described previously. The purified toxin gives a single band in polyacrylamide gel
electrophoresis and Western immunoblot analysis and has a molecular mass of 400
kDa.
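
The plate counts mentioned above (samples serially diluted, plated, and colonies counted) are conventionally converted back to a concentration in the undiluted sample. The following is a minimal sketch of that conversion; the function names, the plated volume, and the example counts are hypothetical and are not taken from the text.

```python
def cfu_per_ml(colonies: int, dilution: float, plated_volume_ml: float) -> float:
    """Colony-forming units per ml of undiluted sample.

    dilution is the fraction of original sample in the plated dilution,
    e.g. 1e-5 for a 10^-5 dilution.
    """
    if dilution <= 0 or plated_volume_ml <= 0:
        raise ValueError("dilution and plated volume must be positive")
    return colonies / (dilution * plated_volume_ml)


def mean_cfu_per_ml(plates, plated_volume_ml: float) -> float:
    """Average the estimate over replicate plates given as (colonies, dilution) pairs."""
    estimates = [cfu_per_ml(c, d, plated_volume_ml) for c, d in plates]
    return sum(estimates) / len(estimates)


# Hypothetical example: 42 colonies from 0.1 ml of a 10^-5 dilution.
example_titer = cfu_per_ml(42, 1e-5, 0.1)
```

The same calculation applies to the cecal homogenates counted later in this example, with the result expressed per ml of homogenate.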

C3He/J axenic adult mice are reared in a Trexler-type isolator fitted with
a rapid transfer system (La Calhène, Vélizy, France) and fed a rodent diet (RO340,
UAR, Villemoisson, France) ad libitum. All materials used for the mice are sterilized
by heat or gamma irradiation.

Pseudomembranous cecitis is induced as follows. Axenic mice are
inoculated through the orogastric route with 1 ml of a 24-hour culture of C. difficile VPI
(ca. 10^ vegetative cells per ml). Under these conditions, mice develop a disease
characterized by an intense cecal abrasion together with a severe inflammatory process.
All the animals die within 2 days. For passive protection studies, ascites fluids diluted
1:3 are injected intravenously (0.2 ml at the eye orbital sinus) into axenic mice. Three
days later, serum samples are collected, and mice are challenged with toxinogenic C.
difficile on day 4. Mortality is determined 2 days later. Surviving mice are killed on
day 8 (4 days following challenge with the organism). Each cecum is weighed and
homogenized in phosphate-buffered saline. Bacterial cells are counted, and supernatant
fluids are analyzed for toxins A and B. Levels of serum antibodies to toxin A are
estimated.

Ascites fluids are injected intravenously into axenic mice, and the
stability of the antibodies in serum is examined by ELISA. Antibody titers remain
high for at least 8 days, and the levels on day 8 are similar to those observed on day 3.
Four days after the administration of ascites, mice are challenged with C. difficile VPI.
The results show that mice injected with anti-C. diff-TM are protected against the
disease (no mortality or diarrhea is observed). Analysis of fecal specimens showed that
the protected mice contained similar levels of vegetative cells, toxin A, and toxin B.
Toxin A levels are reduced in mice protected by anti-C. diff-TM compared with toxin A
levels in dying untreated mice.

D. Delivery of Drugs and Fluorescent Compounds Attached to TM with Linkers 

Delivery of a fluorescent compound attached to TM. Confluent pIgR+
MDCK cell monolayer filters are incubated at the basolateral surface for twenty-four
hours with TM-peptide-FL prepared as described above. Cells are then detached with
trypsin (0.25% in 0.02% EDTA) (JRH Biosciences, Lenexa, Kansas), cytocentrifuged
onto glass slides, and fixed with acetone. Fluorescence microscopy (491 nm excitation,
518 nm emission wavelengths) is used to detect the presence of fluorescein. Cells
incubated with TM-peptide-FL yielded a detectable level of fluorescence whereas the
control construct, containing a non-scissile peptide, had no detectable fluorescence.

Delivery of a fluorescent compound attached to dimeric IgA. Confluent
pIgR+ MDCK cell monolayer filters are incubated at the basolateral surface for
twenty-four hours with dIgA-peptide-FL prepared as described above. Cells are then detached
with trypsin (0.25% in 0.02% EDTA) (JRH Biosciences, Lenexa, Kansas),
cytocentrifuged onto glass slides, and fixed with acetone. Fluorescence microscopy
(491 nm excitation, 518 nm emission wavelengths) is used to detect the presence of
fluorescein. Cells incubated with dIgA-peptide-FL yielded a detectable level of
fluorescence whereas the control construct, containing a non-scissile peptide, had no
detectable fluorescence.

Delivery to tumors of an anti-cancer drug linked to TM. The human
colon carcinoma cell line HT-29 (expressing pIgR at its basolateral surface) is grown in
RPMI tissue culture media supplemented with 10% fetal bovine serum (FBS). In vitro
cell lines are used in establishing xenografts in nude mice. Eight to ten week old female
athymic (nu/nu) mice (National Cancer Institute, Bethesda, Maryland) are injected
subcutaneously into the flank with cell suspensions taken from in vitro cultures. Each
mouse receives a single injection of 2 x 10^6 cells to generate solid tumors. Tumor
growth is followed by measurements in two perpendicular diameters. Measurements
are made at periodic intervals to establish tumor growth time curves until animal death.
Starting on day 3 after tumor inoculation, groups of mice are treated with TM(bio)-MRA
(prepared as described above; 100 µg in 200 µL sterile saline) by intraperitoneal
injection. Control mice are treated with TM containing no doxorubicin.

Mice treated with TM(bio)-MRA showed a significant level of tumor 
suppression compared to the controls. 
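
The text follows tumor growth by two perpendicular diameters but does not state how the measurements are summarized. One widely used convention, assumed here purely for illustration, estimates volume as (length x width^2) / 2, taking the longer diameter as length. The sketch below turns a series of caliper readings into a growth time curve; the function names and example readings are hypothetical.

```python
def tumor_volume_mm3(d1_mm: float, d2_mm: float) -> float:
    """Estimate tumor volume from two perpendicular diameters.

    Uses the (length * width^2) / 2 approximation, which is an assumption
    of this sketch and is not specified in the text.
    """
    length, width = max(d1_mm, d2_mm), min(d1_mm, d2_mm)
    return length * width ** 2 / 2.0


def growth_curve(measurements):
    """Convert (day, d1_mm, d2_mm) readings into (day, volume) pairs sorted by day."""
    return [(day, tumor_volume_mm3(a, b)) for day, a, b in sorted(measurements)]


# Hypothetical caliper readings for one mouse.
example_curve = growth_curve([(7, 8.0, 6.0), (3, 4.0, 4.0), (10, 10.0, 6.0)])
```

Curves built this way for treated and control groups can then be compared at matched time points.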

Delivery to tumors of an anti-cancer drug linked to dimeric IgA. Tumors
are initiated as described above and growth is followed by measurements in two
perpendicular diameters. Measurements are made at periodic intervals to establish
tumor growth time curves until animal death. Starting on day 3 after tumor inoculation,
groups of mice are treated with dIgA-MRA (prepared as described above; 300 µg in 200
µL sterile saline) by intraperitoneal injection. Control mice are treated with TM
containing no doxorubicin.

Mice treated with dIgA-MRA showed a significant level of tumor
suppression compared to the controls.

Delivery to tumors of an anti-cancer drug linked to the antigen
combining site of a hybrid antibody. Tumors are initiated as described above and
growth is followed in two perpendicular diameters. Measurements are made at periodic
intervals to establish tumor growth time curves until animal death. Starting on day 3
after tumor inoculation, groups of mice are treated with TM(38C2)-MRA (prepared as
described above; 300 µg in 200 µL sterile saline) by intraperitoneal injection. Control
mice are treated with TM(38C2)-MRA containing a non-scissile peptide
(VAVQSAGTPASGS) (SEQ ID NO:99). Mice treated with TM(38C2)-MRA showed a
significant level of tumor suppression compared to control mice.

Delivery of a fluorescent compound targeted for retention in the
endoplasmic reticulum. Confluent pIgR+ MDCK cell monolayer filters are incubated at
the basolateral surface for twenty-four hours with TM(kdel)-peptide-FL prepared as
described above. Cells are then detached with trypsin (0.25% in 0.02% EDTA) (JRH
Biosciences, Lenexa, Kansas), cytocentrifuged onto glass slides, and fixed with acetone.
Fluorescence microscopy (491 nm excitation, 518 nm emission wavelengths) is used to
detect the presence of fluorescein. Cells incubated with TM(kdel)-peptide-FL yielded a
detectable level of fluorescence whereas the control construct, containing a non-scissile
peptide, had no detectable fluorescence. Fluorescence is further localized to
intracellular structures consistent with endomembrane organelles.

Delivery to tumors of anti-cancer drug targeted for retention in the
endoplasmic reticulum. Tumors are initiated as described above and growth is followed
by measurements in two perpendicular diameters. Measurements are made at periodic
intervals to establish tumor growth time curves until animal death. Starting on day 3
after tumor inoculation, groups of mice are treated with TM(KDEL)-MRA (prepared as
described above; 300 µg in 200 µL sterile saline) by intraperitoneal injection. Control
mice are treated with TM containing no doxorubicin.

Mice treated with TM(KDEL)-MRA showed a significant level of tumor
suppression compared to the controls.

Delivery of a fluorescent compound to nuclei. MDCK cells stably
transfected with cDNA encoding the rabbit pIgR are cultured on nitrocellulose filters in
microwell chambers (Millicell, Millipore, Bedford, Massachusetts). Confluent pIgR+
MDCK cell monolayer filters are incubated with TM-peptide(nuc)-FL containing
nuclear targeting sequences or the control TM-peptide-TR with no sequences, via the
lower compartment. Twenty-four hours after the addition of TM, cells are detached
with trypsin (0.25% in 0.02% EDTA) (JRH Biosciences, Lenexa, Kansas),
cytocentrifuged onto glass slides, and fixed with acetone. Immunofluorescence is used
to detect Texas Red.

TM-peptide(nuc)-FL localizes to nuclei as documented by
immunofluorescence. These observations indicate that during epithelial transcytosis,
specific TM-peptide(nuc)-FL antibody can interact with cytoplasmic or endomembrane
receptors and undergo transport to the nucleus. In contrast, infected monolayers treated
with TM containing no nuclear targeting signal do not demonstrate nuclear fluorescence
localization. These studies document that MDCK cells transport specific
TM-peptide(nuc)-TR containing nuclear targeting sequences to the nucleus.

Delivery of the intestinal trefoil factor attached to TM via the
enterokinase recognition sequence to the intestinal mucosa. Mice lacking intestinal
trefoil factor are produced by targeted gene disruption as described (Mashimo et al.,
Science 274:262-265, 1996). To elicit mild colonic epithelial injury with ulceration,
mice are given dextran sulfate sodium (DSS, 2.5% w/v) in their drinking water. After 1
day, mice are given a daily injection of 50 µg of TM-ITF, prepared as described above,
by tail vein injection.

At nine days after the beginning of the DSS regimen, 50% of control
mice develop bloody diarrhea and die. In contrast, only 5% of the TM-ITF treated mice
develop bloody diarrhea. Inspection of the colons of control mice after DSS treatment
demonstrates the presence of multiple stages of obvious ulceration and hemorrhage. In
contrast, the colons of most of the TM-ITF treated mice are indistinguishable from mice
receiving no DSS.

From the foregoing, it will be appreciated that, although specific 
embodiments of the invention have been described herein for the purpose of illustration, 
various modifications may be made without deviating from the spirit and scope of the 
invention. 


Summary of Sequence Listing 

SEQ ID NO:1 is amino acid sequence of human J chain
SEQ ID NO:2 is amino acid sequence of mouse J chain
SEQ ID NO:3 is amino acid sequence of rabbit J chain
SEQ ID NO:4 is amino acid sequence of bovine J chain
SEQ ID NO:5 is amino acid sequence of bull frog J chain
SEQ ID NO:6 is amino acid sequence of earth worm J chain
SEQ ID NO:7 is nucleotide sequence of "full length" TM cDNA (Table III)
SEQ ID NO:8 is nucleotide sequence of Core TM cDNA (Table IX)
SEQ ID NO:9 is nucleotide sequence of C2 fragment (Table V)
SEQ ID NO:10 is nucleotide sequence of D1.1 fragment (Table VI)
SEQ ID NO:11 is nucleotide sequence of L3D fragment (Table VII)
SEQ ID NO:12 is nucleotide sequence of T4 fragment (Table VIII)
SEQ ID NO:13 is nucleotide sequence of Core TM cDNA using L3 (Table X)
SEQ ID NO:14 is nucleotide sequence of L3 fragment (Table VII.A)
SEQ ID NO:15 is nucleotide sequence of D1 fragment (Table VI.A)
SEQ ID NO:16 is nucleotide sequence of TpS2 (Table XI)
SEQ ID NO:17 is amino acid sequence of "full length" TM cDNA (Table III)
SEQ ID NO:18 is amino acid sequence of Core TM cDNA (Table IX)
SEQ ID NO:19 is amino acid sequence of C2 fragment (Table V)
SEQ ID NO:20 is amino acid sequence of D1.1 fragment (Table VI)
SEQ ID NO:21 is amino acid sequence of L3D fragment (Table VII)
SEQ ID NO:22 is amino acid sequence of T4 fragment (Table VIII)
SEQ ID NO:23 is amino acid sequence of Core TM cDNA using L3 (Table X)
SEQ ID NO:24 is amino acid sequence of L3 fragment (Table VII.A)
SEQ ID NO:25 is amino acid sequence of D1 fragment (Table VI.A)
SEQ ID NO:26 is amino acid sequence of TpS2 (Table XI)
SEQ ID NO:27 is complementary nucleotide sequence of "full length" TM cDNA (Table III)
SEQ ID NO:28 is complementary nucleotide sequence of Core TM cDNA (Table IX)
SEQ ID NO:29 is complementary nucleotide sequence of C2 fragment (Table V)
SEQ ID NO:30 is complementary nucleotide sequence of D1.1 fragment (Table VI)
SEQ ID NO:31 is complementary nucleotide sequence of L3D fragment (Table VII)
SEQ ID NO:32 is complementary nucleotide sequence of T4 fragment (Table VIII)
SEQ ID NO:33 is complementary nucleotide sequence of Core TM cDNA using L3 (Table X)

SEQ ID NO:34 is complementary nucleotide sequence of L3 fragment (Table VII.A)
SEQ ID NO:35 is complementary nucleotide sequence of D1 fragment (Table VI.A)
SEQ ID NO:36 is complementary nucleotide sequence of TpS2 (Table XI)
SEQ ID NO:37 is Domain 1, 13 amino acid peptide with substantial β-sheet character
SEQ ID NO:38 is peptide recognized by the tobacco etch virus protease NIa
SEQ ID NO:39 is amino acid residues from pro-cathepsin E
SEQ ID NO:40 is linker from procathepsin
SEQ ID NO:41 is linker from polyimmunoglobulin receptor
SEQ ID NO:42 is nucleotide sequence of secretion signal from pMelBac
SEQ ID NO:43 is amino acid sequence of secretion signal from pMelBac
SEQ ID NO:44 is endomembrane retention signal
SEQ ID NO:45 is residues 585-600 of polyimmunoglobulin receptor (human)
SEQ ID NO:46 is Oligonucleotide 1
SEQ ID NO:47 is Oligonucleotide 2
SEQ ID NO:48 is Oligonucleotide 1.1
SEQ ID NO:49 is Oligonucleotide 2.1
SEQ ID NO:50 is Oligonucleotide 1.2ser
SEQ ID NO:51 is Oligonucleotide 2.2ser
SEQ ID NO:52 is Oligonucleotide 1.2val
SEQ ID NO:53 is Oligonucleotide 2.2val
SEQ ID NO:54 is Oligonucleotide 3
SEQ ID NO:55 is Oligonucleotide 4
SEQ ID NO:56 is Oligonucleotide 5
SEQ ID NO:57 is Oligonucleotide 5.1dg
SEQ ID NO:58 is Oligonucleotide 6
SEQ ID NO:59 is Oligonucleotide 6.1dg
SEQ ID NO:60 is Oligonucleotide 7
SEQ ID NO:61 is Oligonucleotide 8
SEQ ID NO:62 is Oligonucleotide 9
SEQ ID NO:63 is Oligonucleotide 9L3A
SEQ ID NO:64 is Oligonucleotide 10L3A
SEQ ID NO:65 is Oligonucleotide 9L3AKDEL
SEQ ID NO:66 is Oligonucleotide 10L3AKDEL
SEQ ID NO:67 is Oligonucleotide 9.2A3
SEQ ID NO:68 is Oligonucleotide 10.2A3
SEQ ID NO:69 is Oligonucleotide 9.3A3/ser68
SEQ ID NO:70 is Oligonucleotide 10.3A3/ser68
SEQ ID NO:71 is Oligonucleotide 9.3A3/val68
SEQ ID NO:72 is Oligonucleotide 10.3A3/val68
SEQ ID NO:73 is Oligonucleotide 10
SEQ ID NO:74 is Oligonucleotide 11
SEQ ID NO:75 is Oligonucleotide 12
SEQ ID NO:76 is Oligonucleotide 13
SEQ ID NO:77 is Oligonucleotide 14
SEQ ID NO:78 is Oligonucleotide 15
SEQ ID NO:79 is Oligonucleotide 16
SEQ ID NO:80 is Oligonucleotide 15KDEL
SEQ ID NO:81 is Oligonucleotide 16KDEL
SEQ ID NO:82 is Oligonucleotide P1
SEQ ID NO:83 is Oligonucleotide P2
SEQ ID NO:84 is Fv heavy forward primer
SEQ ID NO:85 is Fv heavy back primer
SEQ ID NO:86 is Cα3 forward primer
SEQ ID NO:87 is Cα3 back primer
SEQ ID NO:88 is Fvk forward primer
SEQ ID NO:89 is Fvk back primer
SEQ ID NO:90 is nucleotide linker segment
SEQ ID NO:91 is nucleotide linker complement
SEQ ID NO:92 is nucleotide signal peptide
SEQ ID NO:93 is heavy chain forward primer
SEQ ID NO:94 is heavy chain back primer
SEQ ID NO:95 is kappa forward primer
SEQ ID NO:96 is kappa back primer
SEQ ID NO:97 is nucleotide heavy chain signal peptide
SEQ ID NO:98 is nucleotide light chain signal peptide
SEQ ID NO:99 is synthetic peptide linker
SEQ ID NO:100 is nuclear targeting sequence 1
SEQ ID NO:101 is nuclear target sequence 2
SEQ ID NO:102 is HDEL linker sequence for intracellular targeting
SEQ ID NO:103 is Oligonucleotide Tp1
SEQ ID NO:104 is Oligonucleotide Tp2
SEQ ID NO:105 is Oligonucleotide Tp3
SEQ ID NO:106 is Oligonucleotide Tp4
SEQ ID NO:107 is Oligonucleotide Tp5
SEQ ID NO:108 is Oligonucleotide Tp6
SEQ ID NO:109 is the substrate recognition sequence for matrix metalloproteinases
SEQ ID NO:110 is linker from substrate recognition sequence for MMPs
SEQ ID NO:111 is the polyimmunoglobulin receptor from residues 601 to 630
SEQ ID NO:112 is a portion of human IgA1 CH2 region
SEQ ID NO:113 is a scissile peptide recognized and bound by the anti-myc antibody

SEQUENCE LISTING



(1) GENERAL INFORMATION:

(i) APPLICANTS: Hein, Mich B.
                Hiatt, Andrew C.
                Fitchen, John H.

(ii) TITLE OF INVENTION: NOVEL EPITHELIAL TISSUE TARGETING AGENT

(iii) NUMBER OF SEQUENCES: 113

(iv) CORRESPONDENCE ADDRESS:
     (A) ADDRESSEE: SEED and BERRY LLP
     (B) STREET: 6300 Columbia Center, 701 Fifth Avenue
     (C) CITY: Seattle
     (D) STATE: Washington
     (E) COUNTRY: USA
     (F) ZIP: 98104

(v) COMPUTER READABLE FORM:
     (A) MEDIUM TYPE: Floppy disk
     (B) COMPUTER: IBM PC compatible
     (C) OPERATING SYSTEM: PC-DOS/MS-DOS
     (D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA:
     (A) APPLICATION NUMBER:
     (B) FILING DATE: 09-JAN-1998
     (C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:
     (A) NAME: Maki, David J.
     (B) REGISTRATION NUMBER: 31,392
     (C) REFERENCE/DOCKET NUMBER: 310098.401C1

(ix) TELECOMMUNICATION INFORMATION:
     (A) TELEPHONE: (206) 622-4900
     (B) TELEFAX: (206) 682-6031



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
     (A) LENGTH: 137 amino acids
     (B) TYPE: amino acid
     (C) STRANDEDNESS:
     (D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

Gln Glu Asp Glu Arg Ile Val Leu Val Asp Asn Lys Cys Lys Cys Ala
1 5 10 15

Arg Ile Thr Ser Arg Ile Ile Arg Ser Ser Glu Asp Pro Asn Glu Asp
20 25 30

Ile Val Glu Arg Asn Ile Arg Ile Ile Val Pro Leu Asn Asn Arg Glu
35 40 45

Asn Ile Ser Asp Pro Thr Ser Pro Leu Arg Thr Arg Pro Val Tyr His
50 55 60

Leu Ser Asp Leu Cys Lys Lys Cys Asp Pro Thr Glu Val Glu Leu Asp
65 70 75 80

Asn Gln Ile Val Thr Ala Thr Gln Ser Asn Ile Cys Asp Glu Asp Ser
85 90 95

Ala Thr Glu Thr Cys Tyr Thr Tyr Asp Arg Asn Lys Cys Tyr Thr Ala
100 105 110

Val Val Pro Leu Val Tyr Gly Gly Glu Thr Lys Met Val Glu Thr Ala
115 120 125

Leu Thr Pro Asp Ala Cys Tyr Pro Asp
130 135
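
A listing in three-letter code can be checked mechanically against its stated length. The sketch below is an illustration only: it converts SEQ ID NO:1, transcribed from the listing above with the residue names read as standard three-letter codes (Gln, Ile), to one-letter code and compares the residue count with the stated LENGTH of 137. The dictionary and function names are introduced here and are not part of the listing.

```python
# Standard three-letter to one-letter amino acid codes; Xaa marks an unknown residue.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Xaa": "X",
}


def to_one_letter(three_letter: str) -> str:
    """Convert whitespace-separated three-letter residues to a one-letter string."""
    return "".join(THREE_TO_ONE[residue] for residue in three_letter.split())


# SEQ ID NO:1 as transcribed from the listing (position numbers omitted).
SEQ_ID_NO_1 = """
Gln Glu Asp Glu Arg Ile Val Leu Val Asp Asn Lys Cys Lys Cys Ala
Arg Ile Thr Ser Arg Ile Ile Arg Ser Ser Glu Asp Pro Asn Glu Asp
Ile Val Glu Arg Asn Ile Arg Ile Ile Val Pro Leu Asn Asn Arg Glu
Asn Ile Ser Asp Pro Thr Ser Pro Leu Arg Thr Arg Pro Val Tyr His
Leu Ser Asp Leu Cys Lys Lys Cys Asp Pro Thr Glu Val Glu Leu Asp
Asn Gln Ile Val Thr Ala Thr Gln Ser Asn Ile Cys Asp Glu Asp Ser
Ala Thr Glu Thr Cys Tyr Thr Tyr Asp Arg Asn Lys Cys Tyr Thr Ala
Val Val Pro Leu Val Tyr Gly Gly Glu Thr Lys Met Val Glu Thr Ala
Leu Thr Pro Asp Ala Cys Tyr Pro Asp
"""

one_letter = to_one_letter(SEQ_ID_NO_1)
length_matches = len(one_letter) == 137  # stated LENGTH for SEQ ID NO:1
```

Because an unrecognized residue name raises a KeyError, the same conversion also flags misread residue names in a transcribed listing.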

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:
     (A) LENGTH: 135 amino acids
     (B) TYPE: amino acid
     (C) STRANDEDNESS:
     (D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Gln Asp Glu Asn Glu Arg Ile Val Val Asp Asn Lys Cys Lys Cys Ala
1 5 10 15

Arg Ile Thr Ser Arg Ile Ile Pro Ser Ala Glu Asp Pro Ser Gln Asp
20 25 30

Ile Val Glu Arg Asn Val Arg Ile Ile Val Pro Leu Asn Ser Arg Glu
35 40 45

Asn Ile Ser Asp Pro Thr Ser Pro Met Arg Thr Lys Pro Val Tyr His
50 55 60

Leu Ser Asp Leu Cys Lys Lys Cys Asp Thr Thr Glu Val Glu Leu Glu
65 70 75 80

Asp Gln Val Val Thr Ala Ser Gln Ser Asn Ile Cys Asp Ser Asp Ala
85 90 95

Glu Thr Cys Tyr Thr Tyr Asp Arg Asn Lys Cys Tyr Thr Asn Arg Val
100 105 110

Lys Leu Ser Tyr Arg Gly Gln Thr Lys Met Val Glu Thr Ala Leu Thr
115 120 125

Pro Asp Ser Cys Tyr Pro Asp
130 135



(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:
     (A) LENGTH: 137 amino acids
     (B) TYPE: amino acid
     (C) STRANDEDNESS:
     (D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

Asp Asp Glu Ala Thr Ile Leu Ala Asp Asn Lys Cys Met Cys Thr Arg
1 5 10 15

Val Thr Ser Arg Ile Ile Pro Ser Thr Glu Asp Pro Asn Glu Asp Ile
20 25 30

Val Glu Arg Asn Ile Arg Ile Val Val Pro Leu Asn Asn Arg Glu Asn
35 40 45

Ile Ser Asp Pro Thr Ser Pro Leu Arg Arg Asn Pro Val Tyr His Leu
50 55 60

Ser Asp Val Cys Lys Lys Cys Asp Pro Val Glu Val Glu Leu Glu Asp
65 70 75 80

Gln Val Val Thr Ala Thr Gln Ser Asn Ile Cys Asn Glu Asp Asp Gly
85 90 95

Val Pro Glu Thr Cys Tyr Met Tyr Asp Arg Asn Lys Cys Tyr Thr Thr
100 105 110

Met Val Pro Leu Arg Tyr His Gly Glu Thr Lys Met Val Gln Ala Ala
115 120 125

Leu Thr Pro Asp Ser Cys Tyr Pro Asp
130 135

(2) INFORMATION FOR SEQ ID NO : 4 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 4 : 

Glu Asp Glu Ser Thr Val Leu Val Asp Asn Lys Cys Gln Cys Val Arg
1 5 10 15

Ile Thr Ser Arg Ile Ile Arg Asp Pro Asp Asn Pro Ser Glu Asp Ile
20 25 30

Val Glu Arg Asn Ile Arg Ile Ile Val Pro Leu Asn Thr Arg Glu Asn
35 40 45

Ile Ser Asp Pro Thr Ser Pro Leu Arg Thr Glu Pro Lys Tyr Asn Leu
50 55 60

Ala Asn Leu Cys Lys Lys Cys Asp Pro Thr Glu Ile Glu Leu Asp Asn
65 70 75 80

Gln Val Phe Thr Ala Ser Gln Ser Asn Ile Cys Pro Asp Asp Asp Tyr
85 90 95

Ser Glu Thr Cys Tyr Thr Tyr Asp Arg Asn Lys Cys Tyr Thr Thr Leu
100 105 110

Val Pro Ile Thr His Arg Gly Val Thr Arg Met Val Lys Ala Thr Leu
115 120 125 

Thr Pro Asp Ser Cys Tyr Pro Asp 
130 135 

(2) INFORMATION FOR SEQ ID NO : 5 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 119 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

Glu Gln Glu Tyr Ile Leu Ala Asn Asn Lys Cys Lys Cys Val Lys Ile
1 5 10 15

Ser Ser Arg Phe Val Pro Ser Thr Glu Arg Pro Gly Glu Glu Ile Leu
20 25 30

Glu Arg Asn Ile Gln Ile Thr Ile Pro Thr Ser Ser Arg Met Xaa Ile
35 40 45

Ser Asp Pro Tyr Ser Pro Leu Arg Thr Gln Pro Val Tyr Asn Leu Trp
50 55 60

Asp Ile Cys Gln Lys Cys Asp Pro Val Gln Leu Glu Ile Gly Gly Ile
65 70 75 80

Pro Val Leu Ala Ser Gln Pro Xaa Xaa Ser Xaa Pro Asp Asp Glu Cys
85 90 95

Tyr Thr Thr Glu Val Asn Phe Lys Lys Lys Val Pro Leu Thr Pro Asp
100 105 110 

Ser Cys Tyr Glu Tyr Ser Glu 
115 



(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 128 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Asn Lys Cys Met Cys Thr Arg Val Thr Ala Arg Ile Arg Gly Thr Arg
1 5 10 15

Glu Asp Pro Asn Glu Asp Ile Val Glu Arg Tyr Ile Arg Ile Asn Val
20 25 30

Pro Leu Lys Asn Arg Gly Asn Ile Ser Asp Pro Thr Ser Pro Leu Arg
35 40 45

Asn Gln Pro Val Tyr His Leu Ser Pro Ser Cys Lys Lys Cys Asp Pro
50 55 60

Tyr Glu Asp Gly Val Val Thr Ala Thr Glu Thr Asn Ile Cys Tyr Pro
65 70 75 80

Asp Gln Gly Val Pro Gln Ser Cys Arg Asp Tyr Cys Pro Glu Leu Asp
85 90 95

Arg Asn Lys Cys Tyr Thr Val Leu Val Pro Pro Gly Tyr Thr Gly Glu
100 105 110

Thr Lys Met Val Gln Asn Ala Leu Thr Pro Asp Ala Cys Tyr Pro Asp
115 120 125



(2) INFORMATION FOR SEQ ID NO : 7 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 421 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..414 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 7 : 

GAT CAG GAA GAT GAA CGT ATT GTT CTG GTT GAC AAC AAG TGC AAG TGT 48
Asp Gln Glu Asp Glu Arg Ile Val Leu Val Asp Asn Lys Cys Lys Cys
1 5 10 15

GCT CGT ATT ACT TCT AGA ATC ATC CGT AGC TCA GAG GAC CCA AAT GAA 96
Ala Arg Ile Thr Ser Arg Ile Ile Arg Ser Ser Glu Asp Pro Asn Glu
20 25 30

GAT ATA GTC GAA CGT AAC ATC CGT ATC ATC GTC CCA CTG AAT AAC CGG 144
Asp Ile Val Glu Arg Asn Ile Arg Ile Ile Val Pro Leu Asn Asn Arg
35 40 45

GAG AAT ATC TCA GAT CCT ACA AGT CCG TTG CGC ACA CGC TTC GTA TAC 192
Glu Asn Ile Ser Asp Pro Thr Ser Pro Leu Arg Thr Arg Phe Val Tyr
50 55 60

CAC CTG TCA GAT CTG TGT AAG AAG TGT GAT CCA ACA GAG GTA GAG CTG 240
His Leu Ser Asp Leu Cys Lys Lys Cys Asp Pro Thr Glu Val Glu Leu
65 70 75 80

GAC AAT CAG ATA GTC ACT GCG ACT CAA AGC AAC ATT TGC GAT GAG GAC 288
Asp Asn Gln Ile Val Thr Ala Thr Gln Ser Asn Ile Cys Asp Glu Asp
85 90 95

AGC GCT ACA GAA ACC TGC AGC ACC TAC GAT AGG AAC AAA TGC TAC ACG 336
Ser Ala Thr Glu Thr Cys Ser Thr Tyr Asp Arg Asn Lys Cys Tyr Thr
100 105 110

GCC GTG GTT CCG CTC GTG TAT GGT GGA GAG ACA AAA ATG GTG GAA ACT 384
Ala Val Val Pro Leu Val Tyr Gly Gly Glu Thr Lys Met Val Glu Thr
115 120 125

GCC CTT ACG CCC GAT GCA TGC TAT CCG GAC TGAATTC 421
Ala Leu Thr Pro Asp Ala Cys Tyr Pro Asp
130 135



(2) INFORMATION FOR SEQ ID NO : 8 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 215 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..213



(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 8 : 

GAT CAG AAG TGC AAG TGT GCT CGT ATT ACT TCT AGA ATC ATC CGT AGC 48
Asp Gln Lys Cys Lys Cys Ala Arg Ile Thr Ser Arg Ile Ile Arg Ser
1 5 10 15

TCA GAG GAC CCA AAT GAA GAT ATA GTC GAA CGT AAC ATC CGT ATC ATC 96
Ser Glu Asp Pro Asn Glu Asp Ile Val Glu Arg Asn Ile Arg Ile Ile
20 25 30

GTC CCA CTG AAT AAC CGG GAG AAT ATC TCA GAT CCT ACA AGT CCG TTG 144
Val Pro Leu Asn Asn Arg Glu Asn Ile Ser Asp Pro Thr Ser Pro Leu
35 40 45

CGC ACA CGC TTC GTA TAC CAC CTG TCA GAT CTG TGT AAG AAG GAT GAG 192
Arg Thr Arg Phe Val Tyr His Leu Ser Asp Leu Cys Lys Lys Asp Glu
50 55 60

GAC AGC GCT ACA GAA ACC TGC TG 215
Asp Ser Ala Thr Glu Thr Cys
65 70 



(2) INFORMATION FOR SEQ ID NO : 9 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 140 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

CTAGAATCAT CCGTAGCTCA GAGGACCCAA ATGAAGATAT AGTCGAACGT AACATCCGTA 60

TCATCGTCCC ACTGAATAAC CGGGAGAATA TCTCAGATCC TACAAGTCCG TTGCGCACAC 120

GCTTCGTATA CCACCTGTCA 140
(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY : linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
GATCAGAAGT GCAAGTGTGC TCGTATTACT T 
(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 44 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..42



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

GAT CTG TGT AAG AAG GAT GAA GAT TCC GCT ACA GAA ACC TGC 42 
Asp Leu Cys Lys Lys Asp Glu Asp Ser Ala Thr Glu Thr Cys 
75 80 85 

TG 44 



(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 109 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 12 : 
GCACCTACGA TAGGAACAAA TGCTACACGG CCGTGGTTCC GCTCGTGTAT GGTGGAGAGA 60

CAAAAATGGT GGAAACTGCC CTTACGCCCG ATGCATGCTA CCCTGACTG 109

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 286 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..282



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

GAC AAC AAG TGC AAG TGT GCT CGT ATT ACT TCT AGA ATC ATC CGT AGC 48
Asp Asn Lys Cys Lys Cys Ala Arg Ile Thr Ser Arg Ile Ile Arg Ser
15 20 25 30

TCA GAG GAC CCA AAT GAA GAT ATA GTC GAA CGT AAC ATC CGT ATC ATC 96
Ser Glu Asp Pro Asn Glu Asp Ile Val Glu Arg Asn Ile Arg Ile Ile
35 40 45

GTC CCA CTG AAT AAC CGG GAG AAT ATC TCA GAT CCT ACA AGT CCG TTG 144
Val Pro Leu Asn Asn Arg Glu Asn Ile Ser Asp Pro Thr Ser Pro Leu
50 55 60

CGC ACA CGC TTC GTA TAC CAC CTG TCA GAT CTG TGT AAG AAG TGT GAT 192
Arg Thr Arg Phe Val Tyr His Leu Ser Asp Leu Cys Lys Lys Cys Asp
65 70 75

CCA ACA GAG GTA GAG CTG GAC AAT CAG ATA GTC ACT GCG ACT CAA AGC 240
Pro Thr Glu Val Glu Leu Asp Asn Gln Ile Val Thr Ala Thr Gln Ser
80 85 90

AAC ATT TGC GAT GAG GAC AGC GCT ACA GAA ACC TGC TAC TGA 282
Asn Ile Cys Asp Glu Asp Ser Ala Thr Glu Thr Cys Tyr *
95 100 105

ATTC 286



(2) INFORMATION FOR SEQ ID NO : 14 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 105 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY : linear 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..105



(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 14 : 

GAT CTG TGT AAG AAG TGT GAT CCA ACA GAG GTA GAG CTG GAC AAT CAG 48
Asp Leu Cys Lys Lys Cys Asp Pro Thr Glu Val Glu Leu Asp Asn Gln
95 100 105 110

ATA GTC ACT GCG ACT CAA AGC AAC ATT TGC GAT GAG GAC AGC GCT ACA 96
Ile Val Thr Ala Thr Gln Ser Asn Ile Cys Asp Glu Asp Ser Ala Thr
115 120 125 

CTT TGG ACG 105 
Leu Trp Thr 



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 61 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 
GATCAGGAAG ATGAACGTAT TGTTCTGGTT GACAACAAGT GCAAGTGTGC TCGTATTACT 60 
T 61 
(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 198 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16: 

GCGATGACGA CGATAAGGCC CAAACGGAGA CCTGTACTGT TGCGCCTCGT GAACGGCAAA 60

ACTGCGGATT CCCGGAAGTA ACACCCTCTC AGTGCGCTAA TAAAGGCTGC TGTTTTGATG 120

ACACGGTACG GGGCGTTCCG TGGTGCTTCT ACCCCAATAC AATTGACGTT CCGCCTGAAG 180

AAGAGTGCGA GCCGTAAG 198 
(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 138 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

Asp Gln Glu Asp Glu Arg Ile Val Leu Val Asp Asn Lys Cys Lys Cys
1 5 10 15

Ala Arg Ile Thr Ser Arg Ile Ile Arg Ser Ser Glu Asp Pro Asn Glu
20 25 30

Asp Ile Val Glu Arg Asn Ile Arg Ile Ile Val Pro Leu Asn Asn Arg
35 40 45

Glu Asn Ile Ser Asp Pro Thr Ser Pro Leu Arg Thr Arg Phe Val Tyr
50 55 60

His Leu Ser Asp Leu Cys Lys Lys Cys Asp Pro Thr Glu Val Glu Leu
65 70 75 80

Asp Asn Gln Ile Val Thr Ala Thr Gln Ser Asn Ile Cys Asp Glu Asp
85 90 95

Ser Ala Thr Glu Thr Cys Ser Thr Tyr Asp Arg Asn Lys Cys Tyr Thr
100 105 110

Ala Val Val Pro Leu Val Tyr Gly Gly Glu Thr Lys Met Val Glu Thr
115 120 125

Ala Leu Thr Pro Asp Ala Cys Tyr Pro Asp
130 135

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 71 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18: 

Asp Gln Lys Cys Lys Cys Ala Arg Ile Thr Ser Arg Ile Ile Arg Ser
1 5 10 15

Ser Glu Asp Pro Asn Glu Asp Ile Val Glu Arg Asn Ile Arg Ile Ile
20 25 30

Val Pro Leu Asn Asn Arg Glu Asn Ile Ser Asp Pro Thr Ser Pro Leu
35 40 45

Arg Thr Arg Phe Val Tyr His Leu Ser Asp Leu Cys Lys Lys Asp Glu
50 55 60 

Asp Ser Ala Thr Glu Thr Cys 
65 70 

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 49 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

Ser Arg Ile Ile Arg Ser Ser Glu Asp Pro Asn Glu Asp Ile Val Glu
1 5 10 15

Arg Asn Ile Arg Ile Ile Val Pro Leu Asn Asn Arg Glu Asn Ile Ser
20 25 30

Asp Pro Thr Ser Pro Leu Arg Thr Arg Phe Val Tyr His Leu Ser Asp
35 40 45

Leu

(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

Asp Gln Lys Cys Lys Cys Ala Arg Ile Thr Ser Arg
1 5 10

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

Asp Leu Cys Lys Lys Asp Glu Asp Ser Ala Thr Glu Thr Cys 
1 5 10

(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

Ser Thr Tyr Asp Arg Asn Lys Cys Tyr Thr Ala Val Val Pro Leu Val
1 5 10 15

Tyr Gly Gly Glu Thr Lys Met Val Glu Thr Ala Leu Thr Pro Asp Ala 
20 25 30 

Cys Tyr Pro Asp 
35 

(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 93 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23: 

Asp Asn Lys Cys Lys Cys Ala Arg Ile Thr Ser Arg Ile Ile Arg Ser
1 5 10 15

Ser Glu Asp Pro Asn Glu Asp Ile Val Glu Arg Asn Ile Arg Ile Ile
20 25 30

Val Pro Leu Asn Asn Arg Glu Asn Ile Ser Asp Pro Thr Ser Pro Leu
35 40 45

Arg Thr Arg Phe Val Tyr His Leu Ser Asp Leu Cys Lys Lys Cys Asp
50 55 60

Pro Thr Glu Val Glu Leu Asp Asn Gln Ile Val Thr Ala Thr Gln Ser
65 70 75 80

Asn Ile Cys Asp Glu Asp Ser Ala Thr Glu Thr Cys Tyr
85 90 

(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24: 

Asp Leu Cys Lys Lys Cys Asp Pro Thr Glu Val Glu Leu Asp Asn Gln
1 5 10 15

Ile Val Thr Ala Thr Gln Ser Asn Ile Cys Asp Glu Asp Ser Ala Thr
20 25 30

Leu Trp Thr
35

(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

Asp Gln Glu Asp Glu Arg Ile Val Leu Val Asp Asn Lys Cys Lys Cys
1 5 10 15

Ala Arg Ile Thr Ser Arg
20 

(2) INFORMATION FOR SEQ ID NO:26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 66 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY : linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 

Cys Ser Asp Asp Asp Asp Lys Ala Gln Thr Glu Thr Cys Thr Val Ala
1 5 10 15

Pro Arg Glu Arg Gln Asn Cys Gly Phe Pro Gly Val Thr Pro Ser Gln
20 25 30

Cys Ala Asn Lys Gly Cys Cys Phe Asp Asp Thr Val Arg Gly Val Pro
35 40 45

Trp Cys Phe Tyr Pro Asn Thr Ile Asp Val Pro Pro Glu Glu Glu Cys
50 55 60 

Glu Phe 
65 

(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 421 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

CTAGTCCTTC TACTTGCATA ACAAGACCAA CTGTTGTTCA CGTTCACACG AGCATAATGA 60

AGATCTTAGT AGGCATCGAG TCTCCTGGGT TTACTTCTAT ATCAGCTTGC ATTGTAGGCA 120

TAGTAGCAGG GTGACTTATT GGCCCTCTTA TAGAGTCTAG GATGTTCAGG CAACGCGTGT 180

GCGAAGCATA TGGTGGACAG TCTAGACACA TTCTTCACAC TAGGTTGTCT CCATCTCGAC 240

CTGTTAGTCT ATCAGTGACG CTGAGTTTCG TTGTAAACGC TACTCCTGTC GCGATGTCTT 300

TGGACGTCGT GGATGCTATC CTTGTTTACG ATGTGCCGGC ACCAAGGCGA GCACATACCA 360

CCTCTCTGTT TTTACCACCT TTGACGGGAA TGCGGGCTAC GTACGATAGG CCTGACTTAA 420

G 421



(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 219 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 

CTAGTCTTCA CGTTCACACG AGCATAATGA AGATCTTAGT AGGCATCGAG TCTCCTGGGT 60

TTACTTCTAT ATCAGCTTGC ATTGTAGGCA TAGTAGCAGG GTGACTTATT GGCCCTCTTA 120

TAGAGTCTAG GATGTTCAGG CAACGCGTGT GCGAAGCATA TGGTGGACAG TCTAGACACA 180

TTCTTCCTAC TCCTGTCGCG ATGTCTTTGG ACGACTTAA 219 
(2) INFORMATION FOR SEQ ID NO:29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 140 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY : linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 
TTAGTAGGCA TCGAGTCTCC TGGGTTTACT TCTATATCAG CTTGCATTGT AGGCATAGTA 60

GCAGGGTGAC TTATTGGCCC TCTTATAGAG TCTAGGATGT TCAGGCAACG CGTGTGCGAA 120

GCATATGGTG GACAGTCTAG 140

(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:30: 
TCTTCACGTT CACACGAGCA TAATGAAGAT C 31 
(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 44 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 
ACACATTCTT CCTACTTCTC AGGCGATGTC TTTGGACGAC TTAA 44 
(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 117 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:32: 

ACGTCGTGGA TGCTATCCTT GTTTACGATG TGCCGGCACC AAGGCGAGCA CATACCACCT 60

CTCTGTTTTT ACCACCTTTG ACGGGAATGC GGGCTACGTA CGATGGGACT GACTTAA 117 
(2) INFORMATION FOR SEQ ID NO: 33:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 282 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 
CTGTTGTTCA CGTTCACACG AGCATAATGA AGATCTTAGT AGGCATCGAG TCTCCTGGGT 60

TTACTTCTAT ATCAGCTTGC ATTGTAGGCA TAGTAGCAGG GTGACTTATT GGCCCTCTTA 120

TAGAGTCTAG GATGTTCAGG CAACGCGTGT GCGAAGCATA TGGTGGACAG TCTAGACACA 180

TTCTTCACAC TAGGTTGTCT CCATCTCGAC CTGTTAGTCT ATCAGTGACG CTGAGTTTCG 240

TTGTAAACGC TACTCCTGTC GCGATGTCTT TGGACGATGA CT 282
(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 105 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY : linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 
GATCTGTGTA AGAAGTGTGA TCCAACAGAG GTAGAGCTGG ACAATCAGAT AGTCACTGCG 60

ACTCAAAGCA ACATTTGCGA TGAGGACAGC GCTACACTTT GGACG 105 
(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 65 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:
CTAGTCCTTC TACTTGCATA ACAAGACCAA CTGTTGTTCA CGTTCACACG AGCATAATGA 60

AGATC 65 
(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 206 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:36: 

ACTTCGCTAC TGCTGCTATT CCGGGTTTGC CTCTGGACAT GACAACGCGG AGCACTTGCC 60

GTTTTGACGC CTAAGGGCCT TCATTGTGGG AGAGTCACGC GATTATTTCC GACGACAAAA 120

CTACTGTGCC ATGCCCCGCA AGGCACCACG AAGATGGGGT TATGTTAACT GCAAGGCGGA 180

CTTCTTCTCA CGCTCGGCAT TCTTAA 206
(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:

Asp Gln Glu Asp Glu Arg Ile Val Leu Val Asp Asn Lys
1 5 10

(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:

Glu Asn Leu Tyr Phe Gln Ser
1 5 



(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:

Lys Ala His Lys Val Asp Met Val Gln Tyr Thr
1 5 10

(2) INFORMATION FOR SEQ ID NO:40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 

Val Gln Tyr Thr
1 



(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:
Glu Lys Ala Val Ala Asp
1 5
(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 131 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..78



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:42: 

ATG AAA TTC TTA GTC AAC GTT GCC CTT TTT ATG GTC GTA TAC ATT TCT 48 
Met Lys Phe Leu Val Asn Val Ala Leu Phe Met Val Val Tyr Ile Ser
40 45 50 

TAC ATC TAT GCG GAT CCG AGC TCG AGT GCT CTAGATCTGC AGCTGGTACC 98 
Tyr Ile Tyr Ala Asp Pro Ser Ser Ser Ala
55 60 

ATGGAATTCG AAGCTTGGAG TCGACTCTGC TGA 131 



(2) INFORMATION FOR SEQ ID NO:43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:43: 

Met Lys Phe Leu Val Asn Val Ala Leu Phe Met Val Val Tyr Ile Ser
1 5 10 15

Tyr Ile Tyr Ala Asp Pro Ser Ser Ser Ala
20 25 

(2) INFORMATION FOR SEQ ID NO:44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:

Lys Asp Glu Leu 
1 

(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:

Ala Ile Gln Asp Pro Arg Leu Phe Ala Glu Glu Lys Ala Val Ala Asp
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 61 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:46: 
GATCAGGAAG ATGAACGTAT TGTTCTGGTT GACAACAAGT GCAAGTGTGC TCGTATTACT 60

T 61 
(2) INFORMATION FOR SEQ ID NO:47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 61 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:
CTAGAAGTAA TACGAGCACA CTTGCACTTG TTGTCAACCA GAACAATACG TTCATCTTCC 60

T 61 
(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY : linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:48: 
GATCAGAAGT GCAAGTGTGC TCGTATTACT T 31 
(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:49: 
CTAGAAGTAA TACGAGCACA CTTGCACTTC T 31 
(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 61 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 
GATCAGGAAG ATGAACGTAT TGTTCTGGTT GACAACAAGT GCAAGTCCGC TCGTATTACT 60
(2) INFORMATION FOR SEQ ID NO: 51:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 61 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY : linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51: 
CTAGAAGTAA TACGAGCGGA CTTGCACTTG TTGTCAACCA GAACAATACG TTCATCTTCC 60

T 61 
(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 61 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:
GATCAGGAAG ATGAACGTAT TGTTCTGGTT GACAACAAGT GCAAGGTTGC TCGTATTACT 60 
T 61 
(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 61 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:53: 
CTAGAAGTAA TACGAGCAAC CTTGCACTTG TTGTCAACCA GAACAATACG TTCATCTTCC 60
T 61
(2) INFORMATION FOR SEQ ID NO: 54:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY : linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54: 
CTAGAATCAT CCGTAGCTCA GAGGACCCAA ATGAAGATAT AGTCGAA 47

(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 58 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY : linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55: 
GATACGGATG TTACGTTCGA CTATATCTTC ATTTGGGTCC TCTGAGCTAC GGATGATT 58

(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 49 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56: 
CGTAACATCC GTATCATCGT CCCACTGAAT AACCGGGAGA ATATCTCAG 49

(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 49 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:
CGTAACATCC GTATCATCGT CCCACTGAAT AACCGGGAGC ACATCTCAG 49

(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 49 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS : single 

(D) TOPOLOGY : linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:58: 
ACGGACTTGT AGGATCTGAG ATATTCTCCC GGTTATTCAG TGGGACGAT 49

(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 49 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59: 
ACGGACTTGT AGGATCTGAG ATGTGCTCCC GGTTATTCAG TGGGACGAT 49

(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 44 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60:
ATCCTACAAG TCCGTTGCGC ACACGCTTCG TATACCACCT GTCA 44
(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61: 
GATCTGACAG GTGGTATACG AAGCGTGTGC GCA 33
(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 60 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:62: 
GATCTGTGTA AGAAGTGTGA TCCAACAGAG GTAGAGCTGG ACAATCAGAT AGTCACTGCA 60



(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 44 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63: 
GATCTGTGTA AGAAGGATGA GGACAGCGCT ACAGAAACCT GCTG 44 
(2) INFORMATION FOR SEQ ID NO: 64: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 44 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64: 
AATTCAGCAG GTTTCTGTAG CGCTGTCCTC ATCCTTCTTA CACA 44
(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 62 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65: 

GATCTGTGTA AGAAGGATGA GGACAGCGCT ACAGAAACCT GCTACGAGAA GGATGAGCTG 60

TG 62 
(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 62 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: 
AATTCACAGC TCATCCTTCG CGTCGCAGGT TTCTGTAGCG CTGTCCTCAT CCTTCTTACA 60

CA 62 
(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 59 base pairs

(B) TYPE : nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:
GATCTGTGTA AGAAGTCTGA TATCGATGAA GATTCCGCTA CAGAAACCTG CAGCACATG 59

(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 59 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68: 
AATTCATGTG CTGCAGGTTT CTGTAGCGGA ATCTTCATCG ATATCAGACT TCTTACACA 59

(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 64 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69: 
GATCTGTCTA AGAAGTCTGA TATCGATGAA GATTACAGAT TCTTCAGACT ATAGCTACTT 60

CTAA 64 
(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70:
AATCTTCATC GATATCAGAC TTCTTAGACA 
(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 64 base pairs 

(B) TYPE : nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71: 
GATCTGGTTA AGAAGTCTGA TATCGATGAA GATTACCAAT TCTTCAGACT ATAGCTACTT
CTAA 

(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72: 
AATCTTCATC GATATCAGAC TTCTTAACCA 
(2) INFORMATION FOR SEQ ID NO:73: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 41 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY : linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:73: 
ATTGTCCAGC TCTACCTCTG TTGGATCACA CTTCTTACAC A
(2) INFORMATION FOR SEQ ID NO: 74:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 46 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY : linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 74 : 
ACTCAAAGCA ACATTTGCGA TGAGGACAGC GCTACAGAAA CCTGCA 46

(2) INFORMATION FOR SEQ ID NO: 75: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 57 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY : linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75: 
GGTTTCTGTA GCGCTCTGCT CATCGCAAAT GTTGCTTTGA GTCGCAGTGA CTATCTG 57 
(2) INFORMATION FOR SEQ ID NO:76: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 59 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:76: 
GCACCTACGA TAGGAACAAA TGCTACACGG CCGTGGTTCC GCTCGTGTAT GGTGGAGAG 59

(2) INFORMATION FOR SEQ ID NO: 77: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 48 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77:
GAGCGGAACC ACGGCCGTGT AGCATTTGTT CCTATCGTAG GTGCTGCA 48
(2) INFORMATION FOR SEQ ID NO: 78: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

( D ) TOPOLOGY : linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78: 
ACAAAAATGG TGGAAACTGC CCTTACGCCC GATGCATGCT ATCCGGACTG 50

(2) INFORMATION FOR SEQ ID NO: 79: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 69 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79: 

AATTCAGTCC GGATAGCATG CATCGGGCGT AAGGGCAGTT TCCACCATTT TTGTCTCTCC 60

ACCATACAC 69
(2) INFORMATION FOR SEQ ID NO: 80: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 62 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80:
ACAAAAATGG TGGAAACTGC CCTTACGCCC GATGCATGCT ATCCGGACAA GGATGAATTG 



(2) INFORMATION FOR SEQ ID NO: 81: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 81 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81: 
AATTCACAAT TCATCCTTGT CCGGATAGCA TGCATCGGGC GTAAGGGCAG TTTCCACCAT 
TTTTGTCTCT CCACCATACA C 
(2) INFORMATION FOR SEQ ID NO: 82: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 88 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82:
GATCAGGTCG CTGCCATCCA AGACCCGAGG CTGTTCGCCG AAGAGAAGGC CGTCGCTGAC 
TCCAAGTGCA AGTGTGCTCG TATTACTT 
(2) INFORMATION FOR SEQ ID NO: 83: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 88 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:83:
CTAGAAGTAA TACGAGCACA CTTGCACTTG GAGTCAGCGA CGGCCTTCTC TTCGGCGAAC 60

AGCCTCGGGT CTTGGATGGC AGCGACCT 88

(2) INFORMATION FOR SEQ ID NO: 84: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:84: 
TGGTACGAAT TCCAGGTSMA RCTGCAGSAG TCRG 34 
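
SEQ ID NO:84 contains the letters S, M and R, which under IUPAC nucleotide nomenclature denote mixed-base positions (S = C or G, M = A or C, R = A or G). The sketch below enumerates the individual sequences that such a notation encodes. It is illustrative only: the function name `expand_degenerate` is hypothetical, and nothing here is taken from the application beyond the printed sequence itself.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

def expand_degenerate(primer: str) -> list[str]:
    """Return every unambiguous sequence encoded by a mixed-base sequence."""
    # Whitespace between the printed blocks of ten is skipped.
    choices = [IUPAC[base] for base in primer.upper() if not base.isspace()]
    return ["".join(combo) for combo in product(*choices)]

# SEQ ID NO:84 as printed in this listing (34 bases, five two-fold positions).
seq_84 = "TGGTACGAAT TCCAGGTSMA RCTGCAGSAG TCRG"
variants = expand_degenerate(seq_84)
print(len(variants))
```

The number of variants is the product of the number of alternatives at each position, so five two-fold positions give 2^5 = 32 sequences of 34 bases each.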
(2) INFORMATION FOR SEQ ID NO: 85: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85: 
ACAGATATCG GGATTTCTCG CAGACTC 27 
(2) INFORMATION FOR SEQ ID NO: 86: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86: 
ACAGAATATC GTCAACACCT TCCCACCC 28
(2) INFORMATION FOR SEQ ID NO: 87:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87:
ACAAAGCTTT TATTTACCCG ACAGACGGTC 30
(2) INFORMATION FOR SEQ ID NO: 88: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88:
GTCCCCCCTC GAGCGAYATY SWGMTSACCC ARTCT 35
(2) INFORMATION FOR SEQ ID NO: 89: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89:
ACACTGCAGC AGTTGGTGCA GCATCAGC 28
(2) INFORMATION FOR SEQ ID NO: 90: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 53 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90:
CTGCAGGAAG CGGAAGCGGA GGAAGCGGAA GCGGAGGAAG CGGAAGCGAA TTC 53 
(2) INFORMATION FOR SEQ ID NO: 91: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91:
CCTTCGCCTT CGCCTCCTTC GCCTTCGCCT CCTTCGCCTT CGCTTAA 47

(2) INFORMATION FOR SEQ ID NO: 92: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 76 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92:
ACAGGATCCA TGGAAACCCC AGCGCAGCTT CTCTTCCTCC TGCTACTCTG GCTCCCAAGA 60

TACCACCGGA CCCGGG 76 
(2) INFORMATION FOR SEQ ID NO: 93: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93:
TGGTACAGAT CTAGGTSMAR CTGCAGSAGT CRG 33
(2) INFORMATION FOR SEQ ID NO: 94:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94:
ACAGGAATTC AATTTTCTTG TCCACCTT 28
(2) INFORMATION FOR SEQ ID NO:95: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95:
GTTCTAGAGA YATYSWGMTS ACCCARTCT 29
(2) INFORMATION FOR SEQ ID NO: 96: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96:
ACACCGCGGC AGTTGGTGCA GCATCAGC 28
(2) INFORMATION FOR SEQ ID NO: 97: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 75 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97: 
ACAGGATCCA TGGAAACCCC AGCGCAGCTT CTCTTCCTCC TGCTACTCTG GCTCCCAGAT 60

ACCACCGGAA GATCT 75 
(2) INFORMATION FOR SEQ ID NO: 98: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 75 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98: 
ACAACTAGTA TGGAAACCCC AGCGCAGCTT CTCTTCCTCC TGCTACTCTG GCTCCCAGAT 60

ACCACCGGAT CTAGA 75 
(2) INFORMATION FOR SEQ ID NO: 99: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99:

Val Ala Val Gln Ser Ala Gly Thr Pro Ala Ser Gly Ser
1 5 10

(2) INFORMATION FOR SEQ ID NO: 100: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100:

Cys Ala Ala Pro Lys Lys Lys Arg Lys Val
1 5 10

(2) INFORMATION FOR SEQ ID NO: 101: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101:

Cys Ala Ala Lys Arg Pro Pro Ala Ala Ile Lys Lys Ala Ala Ala Gly
1 5 10 15

Gln Ala Lys Lys Lys Lys
20

(2) INFORMATION FOR SEQ ID NO: 102: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102: 
His Asp Glu Leu 



(2) INFORMATION FOR SEQ ID NO: 103: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 77 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:103:
GCGATGACGA CGATAAGGCC CAAACGGAGA CCTGTACTGT TGCGCCTCGT GAACGGCAAA 
ACTGCGGATT CCCGGAA 

(2) INFORMATION FOR SEQ ID NO: 104: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 66 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104:
GTTTTGCCGT TCACGAGGCG CAACAGTACA GGTCTCCGTT TGGGCCTTAT CGTCGTCATC
GCTTCA 

(2) INFORMATION FOR SEQ ID NO: 105: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 72 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105: 
GTAACACCCT CTCAGTGCGC TAATAAAGGC TGCTGTTTTG ATGACACGGT ACGGGGCGTT 
CCGTGGTGCT TC 

(2) INFORMATION FOR SEQ ID NO:106: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 72 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106:
GCCCCGTACC GTGTCATCAA AACAGCAGCC TTTATTAGCG CACTGAGAGG GTGTTACTTC 60

CGGGAATCCG CA 72 
(2) INFORMATION FOR SEQ ID NO: 107: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 49 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107:
TACCCCAATA CAATTGACGT TCCGCCTGAA GAAGAGTGCG AGCCGTAAG 49

(2) INFORMATION FOR SEQ ID NO:108: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 68 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108:
AATTCTTACG GCTCGCACTC TTCTTCAGGC GGCAAGTCAA TTGTATTGGG GTAGAAGCAC 60

CACGGAAC 68

(2) INFORMATION FOR SEQ ID NO: 109: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109:

Pro Leu Gly Ile Ile Gly Gly
1 5



(2) INFORMATION FOR SEQ ID NO: 110: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110: 
Ile Ile Gly Gly



(2) INFORMATION FOR SEQ ID NO: 111: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111:

Val Arg Asp Gln Ala Gln Glu Asn Arg Ala Ser Gly Asp Ala Gly
1 5 10 15

Ser Ala Asp Gly Gln Ser Arg Ser Ser Ser Ser Lys Val Leu Phe
16 20 25 30

(2) INFORMATION FOR SEQ ID NO: 112: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112:

Val Pro Ser Thr Pro Pro Thr Pro Ser Pro Ser Thr Pro Pro Thr
1 5 10 15

Pro Ser Pro Ser Cys Cys His Pro Arg Leu
16 20 25

(2) INFORMATION FOR SEQ ID NO: 113: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113:

Glu Gln Lys Leu Ile Ser Glu Asp Leu
1 5

CLAIMS

1. A targeting molecule linked to at least one biological agent 
wherein said targeting molecule comprises a polypeptide that: 

(a) forms a closed covalent loop; and 

(b) contains at least three peptide domains having β-sheet character,
each of the domains being separated by domains lacking β-sheet character wherein said
polypeptide is not full length dimeric IgA.

2. A targeting molecule according to claim 1 wherein said targeting 
molecule is covalently linked to at least one biological agent. 

3. A targeting molecule according to claim 2 wherein said molecule 
contains at least one cysteine residue linked to the biological agent(s). 

4. A targeting molecule according to claim 2 wherein said molecule 
is linked to a biological agent via a peptide bond. 

5. A targeting molecule according to claim 1 wherein said molecule 
is noncovalently linked to at least one biological agent. 

6. A targeting molecule according to claim 1 wherein said 
polypeptide comprises amino acid residues 13-71 and 93-101 of SEQ ID NO:1, amino
acid residues 13-71 and 93-99 of SEQ ID NO:2, amino acid residues 12-70 and 92-101
of SEQ ID NO:3, amino acid residues 12-70 and 92-100 of SEQ ID NO:4, amino acid
residues 11-69 and 89-96 of SEQ ID NO:5 and/or amino acid residues 3-61 and 79-88
of SEQ ID NO:6, or a variant thereof that differs only in conservative substitutions
and/or modifications.

7. A targeting molecule according to claim 1 wherein said
polypeptide comprises the amino acid sequence recited in SEQ ID NO:7, or a variant
thereof that differs only in conservative substitutions and/or modifications.

8. A targeting molecule according to claim 1 wherein said 
polypeptide comprises the amino acid sequence recited in SEQ ID NO: 8, or a variant 
thereof that differs only in conservative substitutions and/or modifications. 

9. A targeting molecule according to claim 1 wherein said 
polypeptide comprises the amino acid sequence recited in SEQ ID NO: 13, or a variant 
thereof that differs only in conservative substitutions and/or modifications. 

10. A targeting molecule according to claim 1 wherein said 
polypeptide contains at least four peptide domains having β-sheet character, separated
by domains lacking β-sheet character.

11. A targeting molecule according to claim 7 wherein said variant 
comprises amino acid residues 13-99 of SEQ ID NO:2, amino acid residues 12-101 of 
SEQ ID NO:3, amino acid residues 12-100 of SEQ ID NO:4, amino acid residues 11-95
of SEQ ID NO:5 and/or amino acid residues 3-88 of SEQ ID NO:6, or a variant thereof
that differs only in conservative substitutions and/or modifications. 

12. A targeting molecule according to claim 1 wherein said 
polypeptide further comprises a linear N-terminal domain. 

13. A targeting molecule according to claim 12 wherein said
N-terminal domain comprises amino acid residues 1-12 of SEQ ID NO:1, amino acid
residues 1-12 of SEQ ID NO:2, amino acid residues 1-11 of SEQ ID NO:3, amino acid
residues 1-11 of SEQ ID NO:4, amino acid residues 1-10 of SEQ ID NO:5, and/or
amino acid residues 1-2 of SEQ ID NO:6, or a variant thereof that differs only in
conservative substitutions and/or modifications.

14. A targeting molecule according to claim 1 wherein said
polypeptide further comprises a C-terminal domain. 

15. A targeting molecule according to claim 14 wherein said
C-terminal domain comprises a linear peptide having β-sheet character.

16. A targeting molecule according to claim 12 wherein said linear 
peptide comprises amino acid residues 102-108 of SEQ ID NO:1, amino acid residues
100-106 of SEQ ID NO:2, amino acid residues 102-108 of SEQ ID NO:3, amino acid 
residues 101-107 of SEQ ID NO:4 and/or amino acid residues 89-99 of SEQ ID NO:6, 
or a variant thereof that differs only in conservative substitutions and/or modifications. 

17. A targeting molecule according to claim 14 wherein said C- 
terminal domain comprises a covalently closed loop. 

18. A targeting molecule according to claim 17 wherein the
covalently closed loop within said C-terminal domain comprises amino acid residues
109-137 of SEQ ID NO:1, amino acid residues 107-135 of SEQ ID NO:2, amino acid
residues 109-137 of SEQ ID NO:3, amino acid residues 108-136 of SEQ ID NO:4,
amino acid residues 96-119 of SEQ ID NO:5, and/or amino acid residues 100-128 of
SEQ ID NO:6, or a variant thereof that differs only in conservative substitutions and/or
modifications. 

19. A targeting molecule linked to at least one biological agent 
wherein said targeting molecule is a polypeptide comprising a sequence recited in any 
one of SEQ ID NO:1 - SEQ ID NO:6.

20. A targeting molecule linked to at least one biological agent 
wherein said targeting molecule is a polypeptide comprising a sequence recited in SEQ 
ID NO:7.

21. A targeting molecule linked to at least one biological agent
wherein said targeting molecule is a polypeptide comprising a sequence recited in SEQ 
ID NO:8. 

22. A targeting molecule linked to at least one biological agent 
wherein said targeting molecule is a polypeptide comprising a sequence recited in SEQ 
ID NO:13. 

23. A targeting molecule according to any one of claims 19-22 
wherein said targeting molecule is covalently linked to at least one biological agent. 

24. A targeting molecule according to claim 23 wherein said 
targeting molecule contains at least one cysteine residue linked to the biological 
agent(s). 

25. A targeting molecule according to claim 23 wherein said 
molecule is linked to a biological agent via a peptide bond. 

26. A targeting molecule according to claim 23 wherein said 
molecule is linked to a biological agent via a glycoside bond. 

27. A targeting molecule according to claim 23 wherein said 
molecule is linked to a biological agent via a phosphodiester bond. 

28. A targeting molecule according to any one of claims 19-22 
wherein said molecule is noncovalently linked to at least one biological agent. 

29. A targeting molecule capable of specifically binding to a 
basolateral factor associated with an epithelial surface and causing the internalization of 
a biological agent linked thereto, wherein the targeting molecule is not full length 
dimeric IgA.

30. A targeting molecule according to claim 1 or claim 29 wherein
said biological agent is selected from the group consisting of enzymes, binding agents, 
inhibitors, nucleic acids, carbohydrates and lipids. 

31. A pharmaceutical composition comprising a targeting molecule 
linked to at least one biological agent according to claim 1 or claim 29, in combination 
with a pharmaceutically acceptable carrier. 

32. A method for treating a patient afflicted with a disease associated 
with an epithelial surface, comprising administering to a patient a pharmaceutical 
composition according to claim 31.

33. A method according to claim 32 wherein said patient is afflicted 
with a disease selected from the group consisting of cancer, viral infection, 
inflammatory disorders, autoimmune disorders, asthma, celiac disease, colitis, 
pneumonia, cystic fibrosis, bacterial infection, mycobacterial infection and fungal 
infection. 

34. A method for inhibiting the development in a patient of a disease 
associated with an epithelial surface, comprising administering to a patient a 
pharmaceutical composition according to claim 31. 

35. A method according to claim 34 wherein the disease is selected 
from the group consisting of cancer, viral infection, autoimmune disorders, asthma, 
celiac disease, colitis, pneumonia, cystic fibrosis, bacterial infection, mycobacterial 
infection and fungal infection. 

36. A targeting molecule linked to at least one biological agent 
wherein said targeting molecule comprises a polypeptide that: 

(a) forms a closed covalent loop; and

(b) contains at least three peptide domains having β-sheet character,
each of the domains being separated by domains lacking β-sheet character wherein said
targeting molecule is linked to at least one biological agent by a substrate for an
intracellular or extracellular enzyme associated with or secreted from an epithelial 
barrier. 

37. A targeting molecule according to claim 36 wherein said enzyme 
is selected from the group consisting of proteases, glycosidases, phospholipases, 
esterases, hydrolases, and nucleases. 

38. A targeting molecule linked to at least one biological agent 
wherein said targeting molecule comprises a polypeptide that: 

(a) forms a closed covalent loop; and 

(b) contains at least three peptide domains having β-sheet character,
each of the domains being separated by domains lacking β-sheet character wherein said
targeting molecule is linked to at least one biological agent through a side chain of 
amino acids in an antibody combining site. 

39. A targeting molecule linked to at least one biological agent 
wherein said targeting molecule comprises a polypeptide that: 

(a) forms a closed covalent loop; and 

(b) contains at least three peptide domains having β-sheet character,
each of the domains being separated by domains lacking β-sheet character wherein the
biological agent is not naturally associated with the targeting molecule, and wherein the 
biological agent is not iodine. 

40. A targeting molecule according to claim 39 wherein said 
biological agent is selected from the group consisting of enzymes, binding agents, 
inhibitors, nucleic acids, carbohydrates and lipids.

41. A targeting molecule according to claim 39 wherein said
biological agent comprises an antigen combining site.

NOVEL EPITHELIAL TISSUE TARGETING AGENT

ABSTRACT OF THE DISCLOSURE 

Targeting molecules for use in delivering biological agents to epithelial 
tissue are disclosed. Upon delivery, the biological agent(s) may remain within an 
epithelial cell or may undergo transepithelial transport via transcytosis. The targeting
molecules may be used, for example, for the delivery of therapeutic agents.

RAW SEQUENCE LISTING

PATENT APPLICATION US/09/005,318 



DATE: 02/20/98 
TIME: 10:50:14 



INPUT SET: S23610.raw 



This Raw Listing contains the General 
Information Section and up to the first 5 pages. 



SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANTS: Hein, Mich B.
Hiatt, Andrew C.
Fitchen, John H.

(ii) TITLE OF INVENTION: NOVEL EPITHELIAL TISSUE TARGETING AGENT 
(iii) NUMBER OF SEQUENCES: 113 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: SEED and BERRY LLP 

(B) STREET: 6300 Columbia Center, 701 Fifth Avenue 

(C) CITY: Seattle 

(D) STATE: Washington

(E) COUNTRY: USA 

(F) ZIP: 98104 

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER: 

(B) FILING DATE: 09-JAN-1998 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Maki, David J. 

(B) REGISTRATION NUMBER: 31,392

(C) REFERENCE/DOCKET NUMBER: 310098.401C1

(ix) TELECOMMUNICATION INFORMATION: 



(A) TELEPHONE: (206) 622-4900 

(B) TELEFAX: (206) 682-6031 



(2) INFORMATION FOR SEQ ID NO:1:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 137 amino acids 

(B) TYPE: amino acid 




(C) STRANDEDNESS:

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

Gln Glu Asp Glu Arg Ile Val Leu Val Asp Asn Lys Cys Lys Cys Ala
1 5 10 15

Arg Ile Thr Ser Arg Ile Ile Arg Ser Ser Glu Asp Pro Asn Glu Asp
20 25 30

Ile Val Glu Arg Asn Ile Arg Ile Ile Val Pro Leu Asn Asn Arg Glu
35 40 45

Asn Ile Ser Asp Pro Thr Ser Pro Leu Arg Thr Arg Pro Val Tyr His
50 55 60

Leu Ser Asp Leu Cys Lys Lys Cys Asp Pro Thr Glu Val Glu Leu Asp
65 70 75 80

Asn Gln Ile Val Thr Ala Thr Gln Ser Asn Ile Cys Asp Glu Asp Ser
85 90 95 

Ala Thr Glu Thr Cys Tyr Thr Tyr Asp Arg Asn Lys Cys Tyr Thr Ala 
100 105 110 

Val Val Pro Leu Val Tyr Gly Gly Glu Thr Lys Met Val Glu Thr Ala 
115 120 125 

Leu Thr Pro Asp Ala Cys Tyr Pro Asp 
130 135 

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 135 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Gln Asp Glu Asn Glu Arg Ile Val Val Asp Asn Lys Cys Lys Cys Ala
1 5 10 15






Arg Ile Thr Ser Arg Ile Ile Pro Ser Ala Glu Asp Pro Ser Gln Asp
20 25 30

Ile Val Glu Arg Asn Val Arg Ile Ile Val Pro Leu Asn Ser Arg Glu
35 40 45

Asn Ile Ser Asp Pro Thr Ser Pro Met Arg Thr Lys Pro Val Tyr His
50 55 60

Leu Ser Asp Leu Cys Lys Lys Cys Asp Thr Thr Glu Val Glu Leu Glu
65 70 75 80

Asp Gln Val Val Thr Ala Ser Gln Ser Asn Ile Cys Asp Ser Asp Ala
85 90 95

Glu Thr Cys Tyr Thr Tyr Asp Arg Asn Lys Cys Tyr Thr Asn Arg Val
100 105 110

Lys Leu Ser Tyr Arg Gly Gln Thr Lys Met Val Glu Thr Ala Leu Thr
115 120 125 

Pro Asp Ser Cys Tyr Pro Asp 
130 135 



(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 137 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

Asp Asp Glu Ala Thr Ile Leu Ala Asp Asn Lys Cys Met Cys Thr Arg
1 5 10 15

Val Thr Ser Arg Ile Ile Pro Ser Thr Glu Asp Pro Asn Glu Asp Ile
20 25 30

Val Glu Arg Asn Ile Arg Ile Val Val Pro Leu Asn Asn Arg Glu Asn
35 40 45

Ile Ser Asp Pro Thr Ser Pro Leu Arg Arg Asn Pro Val Tyr His Leu
50 55 60

Ser Asp Val Cys Lys Lys Cys Asp Pro Val Glu Val Glu Leu Glu Asp
65 70 75 80


Gln Val Val Thr Ala Thr Gln Ser Asn Ile Cys Asn Glu Asp Asp Gly
85 90 95

Val Pro Glu Thr Cys Tyr Met Tyr Asp Arg Asn Lys Cys Tyr Thr Thr
100 105 110

Met Val Pro Leu Arg Tyr His Gly Glu Thr Lys Met Val Gln Ala Ala
115 120 125 

Leu Thr Pro Asp Ser Cys Tyr Pro Asp 
130 135 

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 136 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Glu Asp Glu Ser Thr Val Leu Val Asp Asn Lys Cys Gln Cys Val Arg
1 5 10 15

Ile Thr Ser Arg Ile Ile Arg Asp Pro Asp Asn Pro Ser Glu Asp Ile
20 25 30

Val Glu Arg Asn Ile Arg Ile Ile Val Pro Leu Asn Thr Arg Glu Asn
35 40 45

Ile Ser Asp Pro Thr Ser Pro Leu Arg Thr Glu Pro Lys Tyr Asn Leu
50 55 60

Ala Asn Leu Cys Lys Lys Cys Asp Pro Thr Glu Ile Glu Leu Asp Asn
65 70 75 80

Gln Val Phe Thr Ala Ser Gln Ser Asn Ile Cys Pro Asp Asp Asp Tyr
85 90 95

Ser Glu Thr Cys Tyr Thr Tyr Asp Arg Asn Lys Cys Tyr Thr Thr Leu
100 105 110

Val Pro Ile Thr His Arg Gly Val Thr Arg Met Val Lys Ala Thr Leu
115 120 125 



Thr Pro Asp Ser Cys Tyr Pro Asp 
130 135 






(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 119 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

Glu Gln Glu Tyr Ile Leu Ala Asn Asn Lys Cys Lys Cys Val Lys Ile
1 5 10 15

Ser Ser Arg Phe Val Pro Ser Thr Glu Arg Pro Gly Glu Glu Ile Leu
20 25 30

Glu Arg Asn Ile Gln Ile Thr Ile Pro Thr Ser Ser Arg Met Xaa Ile
35 40 45

Ser Asp Pro Tyr Ser Pro Leu Arg Thr Gln Pro Val Tyr Asn Leu Trp
50 55 60

Asp Ile Cys Gln Lys Cys Asp Pro Val Gln Leu Glu Ile Gly Gly Ile
65 70 75 80

Pro Val Leu Ala Ser Gln Pro Xaa Xaa Ser Xaa Pro Asp Asp Glu Cys
85 90 95

Tyr Thr Thr Glu Val Asn Phe Lys Lys Lys Val Pro Leu Thr Pro Asp
100 105 110

Ser Cys Tyr Glu Tyr Ser Glu
115

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 128 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:



SEQUENCE VERIFICATION REPORT 

PATENT APPLICATION US/09/005,318 



DATE: 02/20/98 
TIME: 10:50:32 



INPUT SET: S23610.raw



Original Text 



SEQUENCE COMPARISON OF J CHAIN PROTEINS AND DEDUCED J CHAIN SEQUENCES 
FROM SIX ORGANISMS 



10 20 30 40 50 60 

-1 x X X X X X 

QEDERIVLVDNKCKCARITSRIIRSSEDPNEDIVERNIRIIVPLNNRENISDPTSPLRTRF 

-DENERIV P-A---SQ V S M--K- 

D--ATI-A M-T-V P-T V RN- 

_._ST Q-V DPDN-S T E- 

EQEYI-AN VK-S--FVP-T-R-G-E-L Q-TI-TSS-MX----Y Q- 

— M-T-V-A--RGTR Y— N---K-G NQ- 



70 80 90 100 110 120 

X X X X X X 

VYHLSDLCKKCDPTEVELDNQIVTATQSNICDEDSAT ETCYTY DRNKCYTAVVPL 

T ED-V— S S-A NR-K- 

V V ED-V N-DGVP----M- TM— 

K-N-AN 1 VF--S PD-DYS TL--I 

--N-W-I-Q----VQL-IGGIP-L-S-PXXSKP-dE —TE-NF 
PS YEDGV ET- - - YP-QGVPQS - RD-CPEL VL--P
X X X--

VYGGETKMVETALTPDACYPD HUMAN 

S-R.Q s— - BOVINE 

R-H QA S— - MOUSE 

THR-V-R--KAT S RABBIT 

K KKVP S--EYSE BULL FROG 

G-T QN EARTH WORM 



Fig. 1 



